* * * * * * * * * * Welcome to STN International * * * * * * * * * *

Web Page URLs for STN Seminar Schedule - N. America 

"Ask CAS" for self-help around the clock 

New e-mail delivery for search results now available 

PHARMAMarketLetter (PHARMAML) - new on STN 

Aquatic Toxicity Information Retrieval (AQUIRE) now available on STN

Sequence searching in REGISTRY enhanced 
JAPIO has been reloaded and enhanced 
Experimental properties added to the REGISTRY file 
CA Section Thesaurus available in CAPLUS and CA 
CASREACT Enriched with Reactions from 1907 to 1985 
BEILSTEIN adds new search fields 

Nutraceuticals International (NUTRACEUT) now available on STN 

DKILIT has been renamed APOLLIT 

More calculated properties added to REGISTRY 

CSA files on STN 

PCTFULL now covers WO/PCT Applications from 1978 to date
TOXCENTER enhanced with additional content 
Adis Clinical Trials Insight now available on STN 
Simultaneous left and right truncation added to COMPENDEX, ENERGY, INSPEC

CANCERLIT is no longer being updated 
METADEX enhancements 
PCTGEN now available on STN 
TEMA now available on STN 

NTIS now allows simultaneous left and right truncation 
PCTFULL now contains images 

SDI PACKAGE for monthly delivery of multifile SDI results 
EVENTLINE will be removed from STN 
PATDPAFULL now available on STN 

Additional information for trade-named substances without 
structures available in REGISTRY 
Display formats in DGENE enhanced 
MEDLINE Reload 

Polymer searching in REGISTRY enhanced 

Indexing from 1947 to 1956 being added to records in CA/CAPLUS 

New current-awareness alert (SDI) frequency in WPIDS/WPINDEX/WPIX

RDISCLOSURE now available on STN 

Pharmacokinetic information and systematic chemical names 
added to PHAR 

MEDLINE file segment of TOXCENTER reloaded 

Supporter information for ENCOMPPAT and ENCOMPLIT updated 

CHEMREACT will be removed from STN 

Simultaneous left and right truncation added to WSCA 

RAPRA enhanced with new search field, simultaneous left and right truncation
April 4 CURRENT WINDOWS VERSION IS V6.01a, CURRENT
MACINTOSH VERSION IS V6.0b(ENG) AND V6.0Jb(JP),
AND CURRENT DISCOVER FILE IS DATED 01 APRIL 2003
STN Operating Hours Plus Help Desk Availability 
General Internet Information 
Welcome Banner and News Items 

Direct Dial and Telecommunication Network Access to STN 
CAS World Wide Web Site (general information) 



Enter NEWS followed by the item number or name to see news on that 
specific topic. 

All use of STN is subject to the provisions of the STN Customer 
agreement. Please note that this agreement limits use to scientific 
research. Use for software development or design or implementation 
of commercial gateways or other similar uses is prohibited and may 
result in loss of user privileges and other penalties.
************* STN Columbus
FILE 'HOME' ENTERED AT 09:57:22 ON 28 MAY 2003

=> file medline biosis embase caplus
COST IN U.S. DOLLARS                          SINCE FILE      TOTAL
                                                   ENTRY    SESSION
FULL ESTIMATED COST                                 0.21       0.21

FILE 'MEDLINE' ENTERED AT 09:57:38 ON 28 MAY 2003

FILE 'BIOSIS' ENTERED AT 09:57:38 ON 28 MAY 2003
COPYRIGHT (C) 2003 BIOLOGICAL ABSTRACTS INC. (R)

FILE 'EMBASE' ENTERED AT 09:57:38 ON 28 MAY 2003
COPYRIGHT (C) 2003 Elsevier Science B.V. All rights reserved.

FILE 'CAPLUS' ENTERED AT 09:57:38 ON 28 MAY 2003
USE IS SUBJECT TO THE TERMS OF YOUR STN CUSTOMER AGREEMENT
PLEASE SEE "HELP USAGETERMS" FOR DETAILS.
COPYRIGHT (C) 2003 AMERICAN CHEMICAL SOCIETY (ACS)

=> s liu qingyun /au

L1 91 LIU QINGYUN

=> s mcdonald terrence /au

L2 7 MCDONALD TERRENCE

=> s l1 and l2

L3 2 L1 AND L2

=> d l3 total ibib



L3 ANSWER 1 OF 2 BIOSIS COPYRIGHT 2003 BIOLOGICAL ABSTRACTS INC.
ACCESSION NUMBER:     1999:204159 BIOSIS
DOCUMENT NUMBER:      PREV199900204159
TITLE:                Cloning of a novel G-protein-coupled receptor GPR 51
                      resembling GABAB receptors expressed predominantly in
                      nervous tissues and mapped proximal to the hereditary
                      sensory neuropathy type 1 locus on chromosome 9
AUTHOR(S):            Ng, Gordon Y. K. (1); McDonald, Terrence;
                      Bonnert, Tim; Rigby, Michael; Heavens, Robert; Whiting,
                      Paul; Chateauneuf, Anne; Coulombe, Nathalie; Kargman,
                      Stacia; Caskey, Thomas; Evans, Jilly; O'Neill, Gary P.;
                      Liu, Qingyun
CORPORATE SOURCE:     (1) Department of Biochemistry and Molecular Biology, Merck
                      Frosst Center for Therapeutic Research, 16711 TransCanada
                      Highway, Kirkland, PQ, H9H 3L1 Canada
SOURCE:               Genomics, (March 15, 1999) Vol. 56, No. 3, pp. 288-295
                      ISSN: 0888-7543.
DOCUMENT TYPE:        Article
LANGUAGE:             English
SUMMARY LANGUAGE:     English


L3 ANSWER 2 OF 2 CAPLUS COPYRIGHT 2003 ACS
ACCESSION NUMBER:     1999:191160 CAPLUS
DOCUMENT NUMBER:      131:98163
TITLE:                Cloning of a Novel G-Protein-Coupled Receptor GPR 51
                      Resembling GABAB Receptors Expressed Predominantly in
                      Nervous Tissues and Mapped Proximal to the Hereditary
                      Sensory Neuropathy Type 1 Locus on Chromosome 9
AUTHOR(S):            Ng, Gordon Y. K.; McDonald, Terrence;
                      Bonnert, Tim; Rigby, Michael; Heavens, Robert;
                      Whiting, Paul; Chateauneuf, Anne; Coulombe, Nathalie;
                      Kargman, Stacia; Caskey, Thomas; Evans, Jilly;
                      O'Neill, Gary P.; Liu, Qingyun
CORPORATE SOURCE:     Department of Biochemistry and Molecular Biology,
                      Merck Frosst Center for Therapeutic Research,
                      Kirkland, QC, H9H 3L1, Can.
SOURCE:               Genomics (1999), 56(3), 288-295
                      CODEN: GNMCEP; ISSN: 0888-7543
PUBLISHER:            Academic Press
DOCUMENT TYPE:        Journal
LANGUAGE:             English
REFERENCE COUNT:      24 THERE ARE 24 CITED REFERENCES AVAILABLE FOR THIS
                      RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT



=> s hg51 (s) protein (s) coupled (s) receptor 

L4 3 HG51 (S) PROTEIN (S) COUPLED (S) RECEPTOR 

=> d l4 total ibib kwic



L4 ANSWER 1 OF 3 CAPLUS COPYRIGHT 2003 ACS
ACCESSION NUMBER:     2001:232346 CAPLUS
DOCUMENT NUMBER:      135:314242
TITLE:                Characterization of a novel human opsin gene with wide
                      tissue expression and identification of embedded and
                      flanking genes on chromosome 1q43
AUTHOR(S):            Halford, Stephanie; Freedman, Melanie S.; Bellingham,
                      James; Inglis, Suzanne L.; Poopalasundaram, Subathra;
                      Soni, Bobby G.; Foster, Russell G.; Hunt, David M.
CORPORATE SOURCE:     Department of Molecular Genetics, Institute of
                      Ophthalmology, University College London, London, EC1V
                      9EL, UK
SOURCE:               Genomics (2001), 72(2), 203-208
                      CODEN: GNMCEP; ISSN: 0888-7543
PUBLISHER:            Academic Press
DOCUMENT TYPE:        Journal
LANGUAGE:             English
REFERENCE COUNT:      25 THERE ARE 25 CITED REFERENCES AVAILABLE FOR THIS
                      RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT
IT 273191-92-7, G protein-coupled receptor
HG51 (human)
RL: BSU (Biological study, unclassified); PRP (Properties); BIOL
(Biological study)
(amino acid sequence; characterization of a novel human opsin gene with
wide tissue expression and identification of embedded and flanking
genes on chromosome 1q43)



L4 ANSWER 2 OF 3 CAPLUS COPYRIGHT 2003 ACS
ACCESSION NUMBER:         2000:513802 CAPLUS
DOCUMENT NUMBER:          133:130801
TITLE:                    Cloning of a novel human G-protein-coupled receptor
                          (GPCR) -17723 receptor cDNA and its therapeutic use
INVENTOR(S):              Glucksmann, Maria Alexandra
PATENT ASSIGNEE(S):       Millennium Pharmaceuticals, Inc., USA
SOURCE:                   PCT Int. Appl., 79 pp.
                          CODEN: PIXXD2
DOCUMENT TYPE:            Patent
LANGUAGE:                 English
FAMILY ACC. NUM. COUNT:   5
PATENT INFORMATION:

PATENT NO.       KIND  DATE       APPLICATION NO.   DATE
WO 2000043513    A1    20000727   WO 2000-US1592    20000121
    W:  AE, AL, AM, AT, AT, AU, AZ,
        CU, CZ, CZ, DE, DE, DK, DK,
        GH, GM, HR, HU, ID, IL, IN,
        LR, LS, LT, LU, LV, MA, MD,
        RO, RU, SD, SE, SG, SI, SK,
        US, UZ, VN, YU, ZA, ZW, AM,
    RW: GH, GM, KE, LS, MW, SD, SL,
        DK, ES, FI, FR, GB, GR, IE,
        CG, CI, CM, GA, GN, GW, ML,
PRIORITY APPLN. INFO.:    US 1999-234923 A 19990121
REFERENCE COUNT:          6 THERE ARE 6 CITED REFERENCES AVAILABLE FOR THIS
                          RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT
IT 273191-92-7P, G Protein-coupled receptor
HG51 (human)
RL: BPN (Biosynthetic preparation); BPR (Biological process); BSU
(Biological study, unclassified); PRP (Properties); THU (Therapeutic use);
BIOL (Biological study); PREP (Preparation); PROC (Process); USES (Uses)
(amino acid sequence; cloning of novel human G-protein-
coupled receptor (GPCR) -17723 receptor cDNA
and its therapeutic use)



L4 ANSWER 3 OF 3 CAPLUS COPYRIGHT 2003 ACS
ACCESSION NUMBER:         2000:368396 CAPLUS
DOCUMENT NUMBER:          133:27864
TITLE:                    Human G protein-coupled receptor HG51, its sequence,
                          cDNA encoding it, recombinant production and use in
                          methods designed to identify agonists and/or antagonists
INVENTOR(S):              Liu, Qingyun; McDonald, Terrence P.
PATENT ASSIGNEE(S):       Merck & Co., Inc., USA
SOURCE:                   PCT Int. Appl., 68 pp.
                          CODEN: PIXXD2
DOCUMENT TYPE:            Patent
LANGUAGE:                 English
FAMILY ACC. NUM. COUNT:   1
PATENT INFORMATION:

PATENT NO.       KIND  DATE       APPLICATION NO.   DATE
WO 2000031108    A1    20000602   WO 1999-US27305   19991118
    W:  CA, JP, US
    RW: AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL,
EP 1133515       A1    20010919   EP 1999-962792    19991118
    R:  AT, BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE, MC, PT,
        IE, FI
PRIORITY APPLN. INFO.:    US 1998-109717P P 19981124
                          WO 1999-US27305 W 19991118
REFERENCE COUNT:          4 THERE ARE 4 CITED REFERENCES AVAILABLE FOR THIS
                          RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT
TI Human G protein-coupled receptor
HG51, its sequence, cDNA encoding it, recombinant production and
use in methods designed to identify agonists and/or antagonists
AB The invention provides a cDNA mol. encoding a human G-protein-
coupled receptor, HG51, as well as the
receptor encoded by the cDNA mol. The invention also provides an
expression vector (eukaryotic or prokaryotic) contg. HG51 cDNA mols. and
host cells transformed with said vector, used for the recombinant prodn.
of HG51. The invention further provides anti-HG51 antibodies. Still
further, the invention provides methods for identifying substances that
bind and/or modulate HG51, which include potential agonists and/or
antagonists of HG51. The methods are cell based whereby an expression
vector contg. polynucleotides encoding HG51 is transfected into a host
cell, allowing for the recombinant prodn. of HG51 prior to addn. of the
test substance. Finally, the invention provides the cDNA sequence, as
well as the corresponding amino acid sequence of human HG51. The human
HG51 receptor was shown to have sequence homol. to the
rhodopsin subfamily of G protein-coupled
receptors.
ST cDNA sequence human G protein coupled receptor
HG51; recombinant prodn human G protein coupled
receptor HG51; agonist human G protein
coupled receptor HG51 method identification;
antagonist human G protein coupled receptor
HG51 method identification
IT G protein-coupled receptors
RL: BPN (Biosynthetic preparation); BPR (Biological process); BSU
(Biological study, unclassified); BUU (Biological use, unclassified); PRP
(Properties); BIOL (Biological study); PREP (Preparation); PROC (Process);
USES (Uses)
(HG51; human G protein-coupled
receptor HG51, its sequence, cDNA encoding it,
recombinant prodn. and use in methods designed to identify agonists
and/or antagonists)
IT Antibodies
RL: BSU (Biological study, unclassified); BIOL (Biological study)
(antibodies specific for human G-protein-coupled
receptor HG51)
IT Cell membrane
(contains recombinant HG51; human G protein-
coupled receptor HG51, its sequence, cDNA
encoding it, recombinant prodn. and use in methods designed to identify
agonists and/or antagonists)
IT Genetic vectors
(eukaryotic or prokaryotic; human G protein-coupled
receptor HG51, its sequence, cDNA encoding it,
recombinant prodn. and use in methods designed to identify agonists
and/or antagonists)
IT Molecular cloning
Transformation, genetic
(human G protein-coupled receptor
HG51, its sequence, cDNA encoding it, recombinant prodn. and
use in methods designed to identify agonists and/or antagonists)
IT Ligands
RL: BPR (Biological process); BSU (Biological study, unclassified); BUU
(Biological use, unclassified); BIOL (Biological study); PROC (Process);
USES (Uses)
(of HG51; human G protein-coupled
receptor HG51, its sequence, cDNA encoding it,
recombinant prodn. and use in methods designed to identify agonists
and/or antagonists)



IT cDNA sequences
(of cDNA encoding human G protein-coupled
receptor HG51)
IT Protein sequences
(of human G protein-coupled receptor
HG51)
IT 273191-92-7P, G Protein-coupled receptor
HG51 (human)
RL: BPN (Biosynthetic preparation); BPR (Biological process); BSU
(Biological study, unclassified); BUU (Biological use, unclassified); PRP
(Properties); BIOL (Biological study); PREP (Preparation); PROC (Process);
USES (Uses)
(amino acid sequence; human G protein-coupled
receptor HG51, its sequence, cDNA encoding it,
recombinant prodn. and use in methods designed to identify agonists
and/or antagonists)
IT 273192-12-4D, subfragments are claimed
RL: BUU (Biological use, unclassified); PRP (Properties); BIOL (Biological
study); USES (Uses)
(nucleotide sequence; cDNA mol. encoding human G-protein-
coupled receptor HG51, its sequence and use
in recombinant prodn. of HG51)
IT 202544-98-7, 15: PN: WO0031108 SEQID: 3 unclaimed DNA 214908-88-0
214908-89-1 214908-90-4 273192-88-4, 3: PN: WO0031108 SEQID: 4
unclaimed DNA 273192-89-5, 4: PN: WO0031108 SEQID: 5 unclaimed DNA
273192-90-8, 5: PN: WO0031108 SEQID: 6 unclaimed DNA 273192-91-9, 6: PN:
WO0031108 SEQID: 7 unclaimed DNA 273192-92-0, 7: PN: WO0031108 SEQID: 8
unclaimed DNA 273192-93-1, 8: PN: WO0031108 SEQID: 9 unclaimed DNA
273192-94-2, 9: PN: WO0031108 SEQID: 10 unclaimed DNA 290797-99-8
RL: PRP (Properties)
(unclaimed nucleotide sequence; human G protein-
coupled receptor HG51, its sequence, cDNA
encoding it, recombinant prodn. and use in methods designed to identify
agonists and/or antagonists)
IT 93050-34-1, Rhodopsin (human protein moiety reduced)
RL: PRP (Properties)
(unclaimed protein sequence; human G protein-
coupled receptor HG51, its sequence, cDNA
encoding it, recombinant prodn. and use in methods designed to identify
agonists and/or antagonists)



09831765Results 
SEQ ID NO: 1 



SUMMARIES

                     %
Result             Query
   No.    Score    Match  Length  DB  ID         Description
     1   1470.4     95.7    2533   9  AF303588   AF303588 Homo sapi
     2   1380.8     89.8    2110   9  AF140242   AF140242 Homo sapi
     3    943.6     61.4    1737  10  AF140241   AF140241 Mus muscu
c    4    515.4     33.5   71872   9  AL133390   AL133390 Human DNA
c    5    424       27.6    5024   6  AX281720   AX281720 Sequence
c    6    422.4     27.5    5000   6  A68713     A68713 Sequence 5
c    7    422.4     27.5    5000   6  AR106624   AR106624 Sequence
c    8    422.4     27.5    5000   9  AF056032   AF056032 Homo sapi
     9    400.4     26.1     449   6  AX013137   AX013137 Sequence
    10    358       23.3     512   6  AX062411   AX062411 Sequence
c   11    317.2     20.6    8303   6  AX345325   AX345325 Sequence
    12    299.4     19.5    8303   6  AX345324   AX345324 Sequence



RESULT 1
AF303588
LOCUS       AF303588     2533 bp    mRNA    linear   PRI 17-APR-2001
DEFINITION  Homo sapiens panopsin (OPN3) mRNA, complete cds.
ACCESSION   AF303588
VERSION     AF303588.1  GI:13649589
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 2533)
  AUTHORS   Halford,S., Freedman,M.S., Bellingham,J., Inglis,S.L.,
            Poopalasundaram,S., Soni,B.G., Foster,R.G. and Hunt,D.M.
  TITLE     Characterization of a novel human opsin gene with wide tissue
            expression and identification of embedded and flanking genes on
            chromosome 1q43
  JOURNAL   Genomics 72 (2), 203-208 (2001)
  MEDLINE   21295039
REFERENCE   2 (bases 1 to 2533)
  AUTHORS   Halford,S., Bellingham,J., Freedman,M.S., Inglis,S.L.,
            Poopalasundaram,S., Foster,R. and Hunt,D.M.
  TITLE     Direct Submission
  JOURNAL   Submitted (05-SEP-2000) Molecular Genetics, Institute of
            Ophthalmology, 11-43 Bath Street, London EC1V 9EL, UK
FEATURES             Location/Qualifiers
     source          1..2533
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="1"
                     /map="1q43"
     gene            1..2533
                     /gene="OPN3"
     CDS             103..1311
                     /gene="OPN3"
                     /note="multiple tissue opsin"
                     /codon_start=1
                     /product="panopsin"
                     /protein_id="AAK37447.1"
                     /db_xref="GI:13649590"
                     /translation="MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERL
                     ALLLGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSC
                     LRNGWVWDTVGCVWDGFSGSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITY
                     IWLYSLAWAGAPLLGWNRYILDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVI
                     AHCYGHILYSIRMLRCVEDLQTIQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICF
                     LVVNGHGHLVTPTISIVSYLFAKSNTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRP
                     AKDLPAAGSEMQIRPIVMSQKDGDRPKKKVTFNSSSIIFIITSDESLSVDDSDKTNGS
                     KVDVIQVRPL"
BASE COUNT      640 a    592 c    567 g    734 t
ORIGIN



Query Match 95.7%; Score 1470.4; DB 9; Length 2533;
Best Local Similarity 99.6%; Pred. No. 7.8e-229;
Matches 1474; Conservative 0; Mismatches 6; Indels 0; Gaps 0;

Qy 58 cagagcaggcggcggagccccagccccacccagtgcggagcgcgccgcgagccccgccgc 117 

Db 1 CGGACGCGTGGGCGGAGC^ 60 

Qy 118 aagctgagcgcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgc 177 

Db 61 AAGCTGAGCGCCTCCGCCCGCCAGGCGCGCCGGCGCCGGGCCATGTACTCGGGGAACCGC 120 

Qy 178 agcggcggccacggctactgggacggcggcggggccgcgggcgctgaggggccggcgccg 237 

Db 121 AGCGGCGGCCACGGCTACTGGGACGGCGGCGGGGCCGCGGGCGCTGAGGGGCCGGCGCCG 180 

Qy 238 gcggggacactgagccccgcgcccctcttcagccccggcacctacgagcgcctggcgctg 297 

Db 181 GCGGGGACACTGAGCCCCGCGCCCCTCTTCAGCCCCGGCACCTACGAGCGCCTGGCGCTG 240 

Qy 298 ctgctgggctccattgggctgctgggcgtcggcaacaacctgctggtgctcgtcctctac 357 

Db 241 CTGCTGGGCTCCATTGGGCTGCTGGGCGTCGGCAACAACCTGCTGGTGCTCGTCCTCTAC 300 

Qy 358 tacaagttccagcggctccgcactcccactcacctcctcctggtcaacatcagcctcagc 417 

Db 301 TACAAGTTCCAGCGGCTCCGCACTCCCACTCACCTCCTCCTGGTCAACATCAGCCTCAGC 360 

Qy 418 gacctgctggtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggc 477

Db 361 GACCTGCTGGTGTCCCTCTTCGGGGTCACCTTTACCTTCGTGTCCTGCCTGAGGAACGGC 420

Qy 478 tgggtgtgggacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggatt 537

Db 421 TGGGTGTGGGACACCGTGGGCTGCGTGTGGGACGGGTTTAGCGGCAGCCTCTTCGGGATT 480

Qy 538 ^"j^*"^^^ 597 

Db 4 81 GTTTCCATTGCCACCCTAACCGTGCTGGCCTATGAACGTTACATTCGC^ 54 0 

Qy 598 agagtgatcaatttttcctgggcctggagggccattacctacatctggctctactcactg 657 

Db 541 AGAGTGATCAATTTTTCCTGGGCCTGGAGGGCCATTACCTACATCTGGCTCTACTCACTG 600 

Qy 658 gcgtgggcaggagcacctctcctgggatggaacaggtacatcctggacgtacacggacta 717 

Db 601 GCGTGGGCAGGAGCACCTCTCCTGGGATGGAACAGGTACATCCTGGACGTACACGGACTA 660 

Qy 718 777 

Db 661 GGCTGCACTGTGGACTGGAAATCCAAGGATGCCAACGATTCCTCCT^ 720 

Qy 778 tttcttggctgcctggtggtgcccctgggtgtcatagcccattgctatggccatattcta 837 

Db 721 TTTCTTGGCTGCCTGGTGGTGCCCCTGGGTGTCATAGCCCATTGCTATGGCCATATTCTA 780 

Qy 838 tattccattcgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaaga 897 

Db 781 TATTCCATTCGAATGCTTCGTTGTGTGGAAGATCTTCAGACAATTCAAGTGATCAAGATT 840

Qy 898 ttaaaatatgaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtc 957 

Db 841 TTAAAATATGAAAAGAAACTGGCCAAAATGTGCTTTTTAATGATATTCACCTTCCTGGTC 900 

Qy 958 tgttggatgccttatat^ 1017 

Db 901 TGTTGGATGC CTT AT ATCGTG ATCTGCTT CTTGGTGGTT AATGGTC ATGGT CACCTGGT ^ 960 



Qy 1018 actccaacaatatctattgtttcgtacctctttgctaaatcaaaeartatatar^^nra 1077

Db 961 ACTCCAACAATATCTATTGTTTCGTACCTCTTTGCTAAATCGAACACTGTATACAATCCA 1020

Qy 1078 gtgatttatgtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctc 1137

Db 1021 GTGATTTATGTCTTCATGATCAGAAAGTTTCGAAGATCCCTTTTGCAGCTTCTGTGCCTC 1080

Qy 1138 cgactgctgaggtgccagaggcctgctaaagacctaccaacaactaaaaatoaaahor'ao 1197

Db 1081 CGACTGCTGAGGTGCCAGAGGCCTGCTAAAGACCTACCAGCAGCTGGAAGTGAAATGCAG 1140

Qy 1198 stcagacccattgtgatgtcacagaaagatggggacagaccaaaaaaaaaaarCTart-i-i-/-' 1257

Db 1141 ATCAGACCCATTGTGATGTCACAGAAAGATGGGGACAGGCCAAAGAAAAAAGTGACTTTC 1200

Qy 1258 aactcttcttccatcatttttatcatcaccagtgataaatcachotraCTi-i-aarTfar"anr' 1317

Db 1201 AACTCTTCTTCCATCATTTTTATCATCACCAGTGATGAATCACTGTCAGTTGACGACAGC 1260

Qy 1318 gacaaaaccaatgggtccaaagttgatgtaatccaaattcot-rr | ^^^cf^acIrfaa^-rtaar^a 1377

Db 1261 GACAAAACCAATGGGTCCAAAGTTGATGTAATCCAAGTTCGTCCTTTGTAGGAATGAAGA 1320

Qy 1378 a-tggcaacgaaagatggggccttaaattgaataecarttthaCT^nhhhr'ai-r'ai- a^rra an 1437

Db 1321 ATGGCAACGAAAGATGGGGCCTTAAATTGGATGCCACTTTTGGArTTTrATraTa an a ar 1380

Qy 1438 tgtctggaatacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccg 1497

Db 1381 TGTCTGGAATACCCGTTCTATGTAATATCAACAGAACCTTGTGGTCCAGCAGGAAATCCG 1440

Qy 1498 aattgcccatatgctcttgggcctcaggaagaggttgaac 1537

Db 1441 AATTGCCCATATGCTCTTGGGCCTCAGGAAGAGGTTGAAC 1480





RESULT 2
AF140242
LOCUS       AF140242     2110 bp    mRNA    linear   PRI 27-MAY-1999
DEFINITION  Homo sapiens encephalopsin mRNA, complete cds.
ACCESSION   AF140242
VERSION     AF140242.1  GI:4894951
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 2110)
  AUTHORS   Blackshaw,S. and Snyder,S.H.
  TITLE     Encephalopsin: A novel mammalian extraretinal opsin discretely
            localized in the brain
  JOURNAL   J. Neurosci. 19 (10), 3681-3690 (1999)
  MEDLINE   99252448
   PUBMED   10234000
REFERENCE   2 (bases 1 to 2110)
  AUTHORS   Blackshaw,S. and Snyder,S.H.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-APR-1999) Genetics, Harvard Medical School, 200
            Longwood Ave., Boston, MA 02115, USA
FEATURES             Location/Qualifiers
     source          1..2110
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     CDS             56..1267
                     /note="extraretinal opsin"
                     /codon_start=1
                     /product="encephalopsin"
                     /protein_id="AAD32671.1"
                     /db_xref="GI:4894952"
                     /translation="MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERL
                     ALLLGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSC
                     LRNGWVWDTVGCVWDGFSGSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITY
                     IWLYSLAWAGAPLLGWNRYILDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVI
                     AHCYGHILYSIRMLRCVEDLQTIQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICF
                     LVVNGHGHLVTPTISIVSYLFAKSNTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRP
                     AKDLPAAGSEMQIRPIVMSQKDGDRPKKKVTFNSSSIIFIITSDESLSVDDSDKTIGV
                     QSLMLIQVRPL"
BASE COUNT      522 a    516 c    480 g    592 t
ORIGIN

Query Match 89.8%; Score 1380.8; DB 9; Length 2110;
Best Local Similarity 98.3%; Pred. No. 2.6e-214;
Matches 1421; Conservative 0; Mismatches 12; Indels 13; Gaps 2;
Qy 105 cgagccccgccgcaagctgagcgcctccgcccgccaggcgcgccggcgccgggccatgta 164 

llll MIIIIIMIIIIIilllllllMIMMlim 

Db 1 CGAGCCCCGCCGC^AGCTGAGCGCCTCCGCCCGCCAGGCGCGCCGGCGCCGGGCCATGTA 60 

Qy iss f<****™^^ 

120 



- - jj 333»«33»-33v,MHHMUi-y^yuuLHL;mel 224 

C1 JJIJIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 

Db SI CTCGGGGAACCGCAGCGGCGGCCACGGCTACTGGGAC^ 



Qy 225 ggggccggcgccggcggggacactgagccccgcgcccctcttcagccccggcacctacga 284 

IIIIMIIIIIIIIIIIIIMllMlllllllllllllllllllliiiiMiiiiHiii 

Db 121 GGGGCCGGCGCCGGCGGGGACACTGAGCCCCGCGCCCCTCTTCAGCCCC(^(^CCTACGA 180 



Qy 285 



???T?if????i?n??i??? ctccatt999Ct9Ct999C9tC99caacaacct9Ct99t 



344 



' I I I I I I I I I I II I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |] | , 
Db 181 GCGCCTGGCGCTGCTGCTGGGCTCCATTGGGCTGCTGGGCGTCGGCAACAACCTGCTGGT 240 

Qy 345 gctcgtcctctactacaagttccagcggctccgcactcccactcacctcctcctggtcaa 404 

MMIMMMMMIMMIMIIIIIMMMIMMMIMMMMMMIIMM 

Db 241 GCTCGTCCTCTACTACMGTTCCAGCGGCTCCGCACTCCI^CTC^CCTCCTCCTCGTC^A 300 

Qy 405 catcagcctcagcgacctgctggtgtccctcttcggggtcacctttaccttcgtgtcctg 464 

nh IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIII 

Db 301 CATCAGCCTCAGCGACCTGCTGGTGTCCCTCTTCGGGGTOICCTTTACCTTCGTGTCCTG 360 

Qy 465 °«gaggaacggct 9 ggtgtgggacaccgtgggctgcgtgtgggacgggtttagcggcag 524 

nh , e ll'llllllllllllllllllllllllllll IIIIIIIIIIIIIIIIHIIIIIIIIII 

Db 361 CCTGAGGAACGGCTGGGTGTCWGA(^CCGTGGGCTGCGTGTGGGACGGGTTTAGCGGCAG 420 

* 7 " it u i???Mfftnn mTfn TTTTTTffTTT nffintffmTnnini? 58 ' 

Db 421 CCTCTTCGGGATTGTTTCCATTGCCACCCTAACCGTGCTGGCCTATGAACGTTOCATTCG 480 
Qy 585 cgtggtccatgccagagtgatcaatttttcctgggcctggagggccattacctacatctg 644 

c MMMMIIMIMIMIIMIIMIIIMIMMMMIMMMMIIMIMIMI 

Db 481 CGTGGTCCATGCCAGAGTGATCAATTTTTCCTGGGC^ 540 
Qy 645 gctctactcactggcgtgggcaggagcacctctcctgggatggaacaggtacatcctgga 704 

MMIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 

Db 541 GCTCTACTCACTGGCGTGGGCAGIGAGCACCTCTCCTGGGATC 600 

Qy 705 cgtacacggactaggctgcactgtggactggaaatccaaggatgccaacgattcctcc 764 

Db 601 Jllil JliJiii iiiliiiliiiillllJ liilJ J J 1 1 J I I I I I I I || | | | | | | | ■ 



CGTACACGGACTAGGCTGCACTGTGGACTGGAAATCCAAGGATGCCAACGATTCCTCCTT 



660 



Qy 

nh ^dii 1 1 1 1 1 1 1 1 1 1 Mil 1 1 1 1 III III 1 1 1 Ml I ill Ml Ml 1 1 1 IN 1 1 1 1 1 1 

Db 661 TGTGCTTTTCTTATTTCTTGGCTGCCTGGTGGTGCCCCTGGGTGTCATAGCC^TOGCTA 



765 ^9cttttcttatttcttggctgc^ 824 

720 

M i M M I II 1 1 II 1 1 1 1 II II II II 1 1 1 1 II II 1 1 II II II I M 

TATATTCCATTCGAATGCTTCGTTGTGTGGAAGATCTTCAGACAA' 

Qy 885 agtgatcaagattttaaaatatgaaaagaaactggccaaaatgtgctttttaatgatatt 944 

M M I II 1 1 1 II II II II I II I || | II 1 1 1 II 1 1 1 1 II 1 1 II II 1 1 1 1 II 1 1 

Db 781 AGTGATCAAGATTTTAAAATATGAAAAGAAACTGGCCAAAATGTC 840 



Qy 825 tggccatattctatattccattcgaatgcttcgttgtgtggaagatcttcagacaattca 884 

1 1 1 M M M I II I I II M I I || II I II II II I II I II II II II II II I II II II 

Db 721 TGGCCATATTCTATATTCCATTCGAATGCTTCGTTGTGTG^ 780 



Qy 


945 


caccttcctggtctgttggatgccttatatcgtgatctgcttcttgqtggttaatqqtca 

mimiiimiiiiiimmimmimminiiiimmimiimmiiimiimi 


1004 


Db 


841 


CACCTTCCTGGTCTGTTGGATGCCTTATATCGTGATCTGCTTCTTGGTGGTTAATGGTCA 


900 


Qy 


1005 


tggtcacctggtcactccaacaatatctattgtttcgtacctctttgctaaatcgaacac 

illllllll Mill IIIIIIIINIIIII lllll IIIIIMM IMIIIIIIiMIMII 


1064 


Db 


901 


TGGTCACCTGGTCACTCCAACAATATCTATTGTTTCGTACCTCTTTGCTAAATCGAACAC 


960 


Qy 


1065 


tgtatacaatccagtgatttatgtcttcatgatcagaaagtttcgaagatcccttttqca 

IIINIIIIIMIMIMIIIIIIIMMIMIIIIIIIIIIMIIIIIMIIIIIIIII 


1124 


Db 


961 


TGTATACAATCCAGTGATTTATGTCTTCATGATCAGAAAGTTTCGAAGATCCCTTTTGCA 


1020 


Qy 


1125 


gcttctgtgcctccgactgctgaggtgccagaggcctgctaaagacctaccaqcaqctqq 

llinilllllllllllllllllllllllMMIMIIIIIIiriMIIIIIMIMIII 


1184 


Db 


1021 


GCTTCTGTGCCTCCGACTGCTGAGGTGCCAGAGGCCTGCTAAAGACCTACCAGCAGCTGG 


1080 


Qy 


1185 


aagtgaaatgcagatcagacccattgtgatgtcacaqaaaqatqqqqacaaqccaaaaaa 


1244 


Db 


1081 


iiiiiiiliiiU 1 1 ' 1 1 1 1 1 1 1 1 II II 1 II 1 1 1 1 II 1 1 II 1 II II II Ml MM IN || 

AAGTGAAATGCAGATCAGACCCATTGTGATGTCACAGAAAGATGGGGACAGGCCAAAGAA 


114 0 


Qy 


1245 


aaaagtgactttcaactcttcttccatcatttttatcatcaccagtgatgaatcactatc 

IIIIIIIIIIMIIIIIIIIIIIIIIIIIIMIIMIIMIIIIIIIIIIMIMlllll 


i in/ 


Db 


1141 


AAAAGTGACTTTCAACTCTTCTTCCATCATTTTTATCATCACCAGTGATGAATCACTGTC 


1200 


Qy 


1305 


agttgacgacagcgacaaaaccaatgggtccaaa gttqatqtaatccaaqttcqtcc 




Db 


1201 


MIIIIIIMIMM Ill Mil III I IMMMIIIIIMI 

AGTTGACGACAGCGACAAAACCATTGGGGTCCAAAGTTTGATGTTAATCCAAGTTCGTCC 


1260 


Qy 


1362 


tttgtaggaatgaagaatggcaacgaaagatggggccttaaattggatgccacttttqqa 

IMIIIIIMIIIII 1 1 1 M 1 II 1 1 II 1 IIIIIIIIIIIMIIIIIIIIIIIIIIIII 


i^zi 


Db 


1261 


TTTGTAGGAATGAAGGATGGCAACGAAAGGTGGGGCCTTAAATTGGATGCCACTTTTGGA 


1320 


Qy 


1422 


Ililllllll M 1 II 1 II 1 1 1 1 1 1 1 1 M 1 M 1 II 1 1 II 1 1 1 1 1 1 1 II II 1 


14 71 


Db 


1321 


CTTT CATCAT CCTC C TGAAGAAGAAfiTfiTP TP r a a t a nnn r> tt a tv m * * ov-i* tIA*A 
* * vwu-ivji-irt\j,firt^ luiLi vjbAA i AL.L-L.vj 1 1 L.TATGTAATATCAACAG 


1380 


Qy 


1472 


aaccttgtggtccagcaggaaatccgaattgcccatatgctcttgggcctcaggaaqaqq 

iMNHIIIIMIMIIIIIIIIIIIIIIIIIillliiiiiiiiii 

AACCTTGTGGTCCAGCAGGAAATCCGAATTGCCCATATGCTCTTGGGCCTCAGGAAGAGG 


1531 


Db 


1381 


1440 


Qy 


1532 


ttgaac 1537 




Db 


1441 


nun 

TTGAAC 1446 





SUMMARIES

                     %
Result             Query
   No.    Score    Match  Length  DB  ID         Description
     1   1537      100.0    1537  21  AAA38861   Human G-protein co
     2   1459       94.9    2144  21  AAA73212   Human 17723 recept
     3   1373.2     89.3    2037  22  AAD19720   Dendritic cell (DC
     4   1028       66.9    1697  22  AAF33051   Human secreted pro
     5    687.8     44.7    1267  21  AAZ34604   Human receptor mol
     6    644.4     41.9    1763  21  AAC69518   Human secreted pro
     7    437       28.4     619  22  AAD19721   Dendritic cell (DC
     8    435       28.3   12291  22  AAK79265   Human immune/haema
     9    424       27.6    5024  24  AAS94874   Human DNA sequence
    10    422.4     27.5    5000  19  AAV20609   Human kynurenine-3
    11    400.4     26.1     449  20  AAZ42057   Human endometrium



AAA38861
ID AAA38861 standard; cDNA; 1537 BP.
XX
AC AAA38861;
XX
DT 31-AUG-2000 (first entry)
XX
DE Human G-protein coupled receptor, HG51, coding sequence.
KW Human; G-protein coupled receptor; HG51; signal transduction;
KW rhodopsin receptor; obesity; type II diabetes;
KW inflammatory bowel disease; constipation; diarrhoea; gene therapy; ss.
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT CDS 160..1368
FT /*tag= a
FT /product= "Human HG51"
XX
PN WO200031108-A1.
XX
PD 02-JUN-2000.
XX
PF 18-NOV-1999; 99WO-US27305.
XX
PR 24-NOV-1998; 98US-0109717.
XX
PA (MERI) MERCK & CO INC.
XX
PI Liu Q, McDonald TP;
XX
DR WPI; 2000-400025/34.
DR P-PSDB; AAY98008.
XX
PT New DNA encoding human HG51 (a G-protein coupled receptor), useful in
PT chromosomal mapping studies for identifying the chromosomal locations
PT of the HG51 gene(s).
XX
PS Claim 1; Fig 1; 68pp; English.
XX
CC G protein-coupled receptors (GPCR) are important in signal transduction
CC from the exterior to the interior of cells. Rhodopsin receptors are a
CC type of GPCR which comprise a chromophore-binding pocket which is
CC covalently linked by a protonated Schiff base to a Lys residue in
CC transmembrane domain 7. The present sequence is the coding sequence of
CC the human HG51 GPCR and is a member of the rhodopsin receptor family of
CC GPCRs. Due to the Lys residue and Schiff base present in HG51, it is
CC thought that the HG51 ligand may be a fatty-acid-like molecule. It is
CC also believed that agonists and antagonists of HG51 are useful for
CC treating various disorders such as obesity, type II diabetes,
CC inflammatory bowel disease, constipation or diarrhoea. In addition, the
CC present sequence may be used in gene therapy for the above mentioned
CC disorders.
XX
SQ Sequence 1537 BP; 320 A; 426 C; 421 G; 370 T; 0 other;

Query Match 100.0%; Score 1537; DB 21; Length 1537;
Best Local Similarity 100.0%; Pred. No. 0;



Matches 1537; Conservative 0; Mismatches 0; Indels 0; Gaps 


Qy      1 ggggccacggggggtgcgccggcgcgcggtagcgcgggcccctcagtgcacaatggccag 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ggggccacggggggtgcgccggcgcgcggtagcgcgggcccctcagtgcacaatggccag 60

Qy     61 agcaggcggcggagccccagccccacccagtgcggagcgcgccgcgagccccgccgcaag 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 agcaggcggcggagccccagccccacccagtgcggagcgcgccgcgagccccgccgcaag 120

Qy    121 ctgagcgcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgcagc 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 ctgagcgcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgcagc 180

Qy    181 ggcggccacggctactgggacggcggcggggccgcgggcgctgaggggccggcgccggcg 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ggcggccacggctactgggacggcggcggggccgcgggcgctgaggggccggcgccggcg 240

Qy 241 999^actgagcc^ 300 

Db 241 gggacactgagccccgcgcccctcttcagccccggcacctacgagcgcctggcgctgctg 300 

Qy 301 ^^^ a "^^ 360 

Db 301 ctgggctccattgggctgctgggcgtcggcaacaacctgctggtgctcgtcctctactac 360 

Qy 361 ^a^fc fc ccagc^^c t cc^c^c t <j=<z=c=a<z=t <j=a<== <= t <== c fc c <z: tr c=aacat ca^c= cfc c= a^c^ac 420 

Db 361 aagttccagcggctccgcactcccactcacctcctcctggtcaacatcagcctcagcgac 420 

Db 421 ctgctggtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggctgg 480 

Qy 481 gtgtOTgacaccgtgg^ 540 

Db 481 gtgtgggacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggattgtt 540 

Qy 541 ^*"^ a ^^ 600 

Db 541 tccattgccaccctaaccgtgctggcctatgaacgttacattcgcgtggtccatgccaga 600 

Qy 601 9tgatcaatttttcc^ 660 

Db 601 gtgatcaatttttcctgggcctggagggccattacctacatctggctctactcactggcg 660 

Qy 661 jj^^^ a ^ 720 

Db 661 tgggcaggagcacctctcctgggatggaacaggtacatcctggacgtacacggactaggc 720 

Qy 721 ^ a ^j|| a ^ 780 

Db 721 tgcactgtggactggaaatccaaggatgccaacgattcctc^^ 780 

Db 781 cttggctgcctggtggtgcccctgggtgtcatagcccattgctatggccatattctatat 840 

Qy 841 tccattcgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagatttta 900 

Db 841 tccattcgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagatttta 900 

Qy 901 ^j^^^ 960 

Db 901 aaatatgaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtctgt 960 

Qy 961 tggatgcctta^ 1Q20 

Db 961 tggatgccttatatcgtgatctgcttcttggtggttaatggtcatggtcacctggtcact 1020 

Qy 1021 ccaacaatatctattgtttcgtacctctttgctaaatcgaacactgtatacaatccagtg 1080 

Db 1021 ccaacaatatctattgtttcgtacctctttgctaaatcgaacactgtatacaatccagtg 1080 

Qy 1081 atttatgtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtg^ 1140 

Db 1081 atttatgtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctccga 1140 

Qy 1141 ctgctgaggtgccagaggcctgctaaagacctaccagcagct i 20 o 

Db 1141 ctgctgaggtgccagaggcctgctaaagacctaccagcagctggaagtgaaatgcagatc 1200 

Qy 1201 agacccattgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttcaa 1260 

Db 1201 agacccattgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttcaac 1260 

Qy   1261 tcttcttccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagcgac 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 tcttcttccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagcgac 1320

Qy   1321 aaaaccaatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaagaatg 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 aaaaccaatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaagaatg 1380

Qy   1381 gcaacgaaagatggggccttaaattggatgccacttttggactttcatcataagaagtgt 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 gcaacgaaagatggggccttaaattggatgccacttttggactttcatcataagaagtgt 1440

Qy   1441 ctggaatacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccgaat 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1441 ctggaatacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccgaat 1500

Qy   1501 tgcccatatgctcttgggcctcaggaagaggttgaac 1537
          |||||||||||||||||||||||||||||||||||||
Db   1501 tgcccatatgctcttgggcctcaggaagaggttgaac 1537
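
The feature table and SQ line of the AAA38861 record above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the helper names `cds_length` and `composition_total` are hypothetical and belong to no search tool, and it assumes that the CDS span includes a three-base stop codon. Under that assumption the CDS 160..1368 covers 1209 bases, that is 402 codons plus a stop, and the per-base counts on the SQ line add up to the declared 1537 BP.

```python
import re


def cds_length(location: str) -> int:
    """Length in bases of a simple 'start..end' feature location."""
    start, end = (int(x) for x in re.findall(r"\d+", location))
    return end - start + 1


def composition_total(sq_line: str):
    """Return (declared length, sum of per-base counts) from an SQ line."""
    declared = int(re.search(r"Sequence\s+(\d+)\s+BP", sq_line).group(1))
    # Matches "320 A;", "426 C;", ... and "0 other;", but not "1537 BP;".
    counts = re.findall(r"(\d+)\s+(?:[ACGT]|other);", sq_line)
    return declared, sum(int(c) for c in counts)


n_bases = cds_length("160..1368")
# Assumption: the last codon of the CDS is a stop codon and is not translated.
n_residues = n_bases // 3 - 1
print(n_bases, n_residues)  # 1209 402

sq = "SQ Sequence 1537 BP; 320 A; 426 C; 421 G; 370 T; 0 other;"
print(composition_total(sq))  # (1537, 1537)
```

The residue count of 402 agrees with the length given for the linked protein record AAY98008 elsewhere in this listing.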



RESULT 2 
AAA73212 

ID AAA73212 standard; cDNA; 2144 BP. 
XX 

AC AAA73212; 
XX 

DT 05-DEC-2000 (first entry) 
XX 
DE Human 17723 receptor protein encoding cDNA SEQ ID NO: 2. 
XX 
KW Human; 17723 receptor protein; chromosome 1q42-44; diagnosis; vaccine; 
KW G-protein coupled receptor; gene therapy; ss. 
XX 
OS Homo sapiens. 
XX 
PN WO200043513-A1. 
XX 
PD 27-JUL-2000. 
XX 
PF 21-JAN-2000; 2000WO-US01592. 
XX 
PR 21-JAN-1999; 99US-0234923. 
XX 
PA (MILL-) MILLENNIUM PHARM INC. 
XX 
PI Glucksmann MA; 
XX 
DR WPI; 2000-476196/41. 
DR P-PSDB; AAB12827. 
XX 
PT A G-protein-coupled receptor designated 17723 and the nucleic acids 
PT that encode it, useful for preventing, diagnosing and treating disorder 
PT associated with inappropriate expression of 17723 receptors - 

PS Claim 3; Page 72-73; 79pp; English. 
XX 

CC The present sequence encodes the human 17723 receptor protein (I) which 

CC belongs to the superfamily of G-protein-coupled receptors. (I) and the 

CC polynucleotide encoding it may be used in the prevention, treatment and 

CC diagnosis of diseases associated with inappropriate 17723 receptor 

CC expression. They may also be used to study the expression and function 

CC of 17723 receptor polypeptides and their role in metabolism. The 17723 

CC receptor polypeptides may be used as antigens in the production of 

CC antibodies against 17723 receptors and in assays to identify modulators 

CC (agonists and antagonists) of 17723 receptor expression and activity 

CC The anti-17723 receptor antibodies and 17723 receptor antagonists may be 

CC used to down regulate 17723 receptor expression and activity. The 

CC anti-17723 receptor antibodies may also be used as diagnostic agents for 

CC detecting the presence of 17723 receptor polypeptides in samples 

CC (e.g. by enzyme linked immunosorbent assay (ELISA)). The 17723 receptor 

CC protein has been mapped to chromosome 1q42-44. 



XX 

SQ Sequence 2144 BP; 525 A; 531 C; 496 G; 590 T; 2 other; 

Query Match 94.9%; Score 1459; DB 21; Length 2144; 
Best Local Similarity 99.9%; Pred. No. 0; 

Matches 1470; Conservative 0; Mismatches 0; Indels 1; Gaps 

Qy 67 cggcggagccccagcccc^ 126 

Db 1 cggcggagccccag-cccacccagtgcggagcgcgccgcgagccccgccgcaagctgagc 59 

Qy 127 IJj^^j^j^ 186 

Db 60 gcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgcagcggcggc 119 

Qy 187 ^^j^^^^ 246 

Db 120 cacggctactgggacggcggcggggccgcgggcgctgaggg 179 

Qy 24 7 j^^^j^^^ 306 

Db 180 ctgagccccgcgcccctcttcagccccggcacctacgagcgcctg^ 239 

Qy 307 ^^"^j 3 ^^ 366 

Db 240 tccattgggctgctgggcgtcggcaacaacctgctggtgctcgtcctctactacaagttc 299 

Qy 367 j^^l^jy^ 426 

Db 300 cagcggctccgcactcccactcacctcctcctggtcaacatcagcctcagcgicctgctg 359 

Qy 427 gtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggctgggtgtgg 486 

Db 360 gtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggctgggtgtgg 419 

Qy 487 gacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggattgtttccatt 546 

Db 420 gacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggattgtttccatt 479 

Qy 547 ^ ^^^-^ T^^Y 1 T"T"T TT"T^"T T^^T T T T"T T'T"T t T !J T'T T T"T T"T T T?"T H T r T^?'T c fc ^^-s-^sts-^*^ «= eoe 

Db 480 gccaccctaaccgtgctggcctatgaacgttacattcgcgtggtccatgccagagtgatc 539 

Qy 607 666 

Db 540 aatttttcctgggcctggagggccattacctacatctggctctactcactggcgtgggca 599 

Qy 667 ggagcacctctcctgggatggaacaggtacatcctggacgtacacggactaggctgcact 726 

Db 600 ggagcacctctcctgggatggaacaggtacatcctggacgtacacggactaggctgcact 659 

Qy 727 ^^j^^ 786 

Db 660 gtggactggaaatccaaggatgccaacgattcctcctttgtgctt^ 719 

Qy 787 tgcctggtggtgcccctgggtgtcatagcccattgctatggccatattctatattccatt 846 

Db 720 tgcctggtggtgcccctgggtgtcatagcccattgctatggccatattctatattccatt 779 

Qy 847 cgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagattttaaaatat 906 

Db 780 cgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagattttaaaatat 839 

Qy 907 gaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtctgttggatg 966 

Db 840 gaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtctgttggatg 899 

Db 900 ccttatatcgtgatctgcttcttggtggttaatggtcatg^ 959 



Qy   1027 atatctattgtttcgtacctctttgctaaatcgaacactgtatacaatccagtgatttat 1086
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    960 atatctattgtttcgtacctctttgctaaatcgaacactgtatacaatccagtgatttat 1019

Qy   1087 gtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctccgactgctg 1146
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1020 gtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctccgactgctg 1079

Qy   1147 aggtgccagaggcctgctaaagacctaccagcagctggaagtgaaatgcagatcagaccc 1206
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1080 aggtgccagaggcctgctaaagacctaccagcagctggaagtgaaatgcagatcagaccc 1139

Qy   1207 attgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttcaactcttct 1266
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1140 attgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttcaactcttct 1199

Qy   1267 tccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagcgacaaaacc 1326
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1200 tccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagcgacaaaacc 1259

Qy   1327 aatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaagaatggcaacg 1386
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1260 aatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaagaatggcaacg 1319

Qy   1387 aaagatggggccttaaattggatgccacttttggactttcatcataagaagtgtctggaa 1446
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1320 aaagatggggccttaaattggatgccacttttggactttcatcataagaagtgtctggaa 1379

Qy   1447 tacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccgaattgccca 1506
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1380 tacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccgaattgccca 1439

Qy   1507 tatgctcttgggcctcaggaagaggttgaac 1537
          |||||||||||||||||||||||||||||||
Db   1440 tatgctcttgggcctcaggaagaggttgaac 1470
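
The percentages reported for this hit appear to follow directly from the counts printed beside them. The sketch below reproduces them under two assumptions that are inferred from the figures in this listing, not taken from the search program's documentation: that Query Match is the hit score divided by the query's self-match score (1537 for this query, as in Result 1), and that Best Local Similarity is the number of matches divided by the number of aligned columns. The function names are hypothetical.

```python
def query_match(score: float, self_score: float) -> float:
    """Hit score as a percentage of the query's self-match score (assumed)."""
    return round(100.0 * score / self_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of aligned columns (assumed)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Result 2 (AAA73212): Score 1459; Matches 1470; Indels 1.
print(query_match(1459, 1537))               # 94.9
print(best_local_similarity(1470, 0, 0, 1))  # 99.9
```

The same two formulas also reproduce the 99.4% and 99.5% reported for the protein hit AAE12070 (score 2105 against a self-score of 2117; 400 matches, 1 conservative, 1 mismatch).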



SUMMARIES

                  %
Result          Query
   No.   Score  Match  Length  DB  ID                  Description

     1   422.4   27.5    5000   3  US-09-147-522-5     Sequence 5, Appli
     2   107.6    7.0    3016   1  US-07-805-123C-1    Sequence 1, Appli
     3   107.6    7.0    3016   1  US-08-033-081B-1    Sequence 1, Appli
     4      78    5.1    1105   2  US-08-466-103A-15   Sequence 15, Appl
     5      73    4.7    1410   4  US-09-255-368-1     Sequence 1, Appli
     6    69.2    4.5    1420   1  US-08-358-171-1     Sequence 1, Appli
     7    69.2    4.5    1420   3  US-09-090-947-1     Sequence 1, Appli
     8    67.2    4.4    1293   4  US-09-255-368-7     Sequence 7, Appli
     9      67    4.4    1776   1  US-08-722-001-29    Sequence 29, Appl
    10    65.4    4.3    2140   1  US-08-334-698-1     Sequence 1, Appli
    11    65.4    4.3    2140   1  US-08-228-932-1     Sequence 1, Appli
    12    65.4    4.3    2140   1  US-08-468-939-1     Sequence 1, Appli



Result          Query
   No.   Score  Match  Length  DB  ID         Description

     1   660.8   43.0     789  10  BI818538   BI818538 603033059
     2   644.8   42.0     770  10  BI260681   BI260681 602968193
     3   609.8   39.7     909  10  BE894106   BE894106 601438234
     4   580.8   37.8     736  10  BI086726   BI086726 602850078
     5   577.2   37.6     835  10  BF970560   BF970560 602274056
     6     575   37.4     850  10  BI757207   BI757207 603030709
     7   565.4   36.8     748  10  BG252201   BG252201 602365072
     8   515.8   33.6     741  10  BG564220   BG564220 602586010
     9   467.8   30.4     788  10  BF977798   BF977798 602148633
    10   461.8   30.0     784  10  BI758685   BI758685 603024224
    11   426.8   27.8     631   9  BB640431   BB640431 BB640431
    12     426   27.7     819  10  BI088684   BI088684 602851458
    13     423   27.5     424  10  BM194008   BM194008 TCAAP1E64
    14   406.8   26.5     742  10  BI257225   BI257225 602976885
    15   398.4   25.9     615  10  BF132059   BF132059 601821062



SEQ ID NO: 2 



SUMMARIES

Result          Query
   No.   Score  Match  Length  DB  ID         Description

     1    2117  100.0     402  21  AAB12827   Human 17723 recept
     2    2117  100.0     402  21  AAY98008   Human G-protein co
     3    2105   99.4     402  22  AAE12070   Dendritic cell (DC
     4    1063   50.2     199  22  AAB64743   Human secreted pro
     5     756   35.7     147  21  AAY32195   Human receptor mol
     6     664   31.4     163  22  AAE12071   Dendritic cell (DC
     7     664   31.4     879  22  AAU31008   Novel human secret
     8     572   27.0     123  20  AAY60172   Human endometrium
     9     564   26.6     122  21  AAB38327   Human secreted pro
    10   459.5   21.7     349  10  AAP90554   Bovine rhodopsin.
    11     455   21.5     348  17  AAR93116   Rhodopsin. Homo s
    12     451   21.3     348  21  AAY98009   Human rhodopsin re
    13     449   21.2     348  14  AAR38483   Rhodopsin protein.
    14     424   20.0     354  21  AAY57086   Rhodopsin amino ac
    15   420.5   19.9     309  15  AAR48735   G-protein coupled
    16   420.5   19.9     309  17  AAW02707   G-protein coupled



RESULT 1 
AAB12827 

ID AAB12827 standard; Protein; 402 AA 
XX 
AC AAB12827; 
XX 
DT 05-DEC-2000 (first entry) 
XX 
DE Human 17723 receptor protein SEQ ID NO:1. 
XX 
KW Human; 17723 receptor protein; chromosome 1q42-44; diagnosis; vaccine; 
KW G-protein coupled receptor; gene therapy. 
XX 
OS Homo sapiens. 
XX 
PN WO200043513-A1. 
XX 
PD 27-JUL-2000. 
XX 
PF 21-JAN-2000; 2000WO-US01592. 
XX 
PR 21-JAN-1999; 99US-0234923. 
XX 
PA (MILL-) MILLENNIUM PHARM INC. 
XX 
PI Glucksmann MA; 
XX 
DR WPI; 2000-476196/41. 
DR N-PSDB; AAA73212. 
XX 
PT A G-protein-coupled receptor designated 17723 and the nucleic acids 
PT that encode it, useful for preventing, diagnosing and treating disorder 
PT associated with inappropriate expression of 17723 receptors - 
PS Claim 1; Page 70-72; 79pp; English. 
CC The present sequence is the human 17723 receptor protein (I), which 
CC belongs to the superfamily of G-protein-coupled receptors. (I) and the 
CC polynucleotide encoding it may be used in the prevention, treatment and 
CC diagnosis of diseases associated with inappropriate 17723 receptor 
CC expression. They may also be used to study the expression and function 
CC of 17723 receptor polypeptides and their role in metabolism. The 17723 
CC receptor polypeptides may be used as antigens in the production of 
CC antibodies against 17723 receptors and in assays to identify modulators 
CC (agonists and antagonists) of 17723 receptor expression and activity 
CC The anti-17723 receptor antibodies and 17723 receptor antagonists may be 
CC used to down regulate 17723 receptor expression and activity. The 
CC anti-17723 receptor antibodies may also be used as diagnostic agents for 
CC detecting the presence of 17723 receptor polypeptides in samples 
CC (e.g. by enzyme linked immunosorbent assay (ELISA)). The 17723 receptor 
CC protein has been mapped to chromosome 1q42-44. 
XX 
SQ Sequence 402 AA; 

Query Match 100.0%; Score 2117; DB 21; Length 402; 
Best Local Similarity 100.0%; Pred. No. 6.3e-222; 
Matches 402; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy      1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 mysgnrsgghgywdgggaagaegpapagtlspaplfspgtyerlalllgsigllgvgnnl 60

Qy     61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 lvlvlyykfqrlrtpthlllvnislsdllvslfgvtftfvsclrngwvwdtvgcvwdgfs 120

Qy    121 GSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 gslfgivsiatltvlayeryirvvharvinfswawraityiwlyslawagapllgwnryi 180

Qy    181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ldvhglgctvdwkskdandssfvlflflgclvvplgviahcyghilysirmlrcvedlqt 240

Qy    241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKS 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 iqvikilkyekklakmcflmiftflvcwmpyivicflvvnghghlvtptisivsylfaks 300

Qy    301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 ntvynpviyvfmirkfrrsllqllclrllrcqrpakdlpaagsemqirpivmsqkdgdrp 360

Qy    361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402
          ||||||||||||||||||||||||||||||||||||||||||
Db    361 kkkvtfnsssiifiitsdeslsvddsdktngskvdviqvrpl 402



RESULT 2 
AAY98008 

ID AAY98008 standard; Protein; 402 AA 
XX 
AC AAY98008; 
XX 
DT 31-AUG-2000 (first entry) 
XX 
DE Human G-protein coupled receptor, HG51. 
XX 
KW Human; G-protein coupled receptor; HG51; signal transduction; 
KW rhodopsin receptor; obesity; type II diabetes; 
KW inflammatory bowel disease; constipation; diarrhoea; gene therapy. 
XX 
OS Homo sapiens. 
XX 
PN WO200031108-A1. 
XX 
PD 02-JUN-2000. 
XX 
PF 18-NOV-1999; 99WO-US27305. 
XX 
PR 24-NOV-1998; 98US-0109717. 
XX 
PA (MERI ) MERCK & CO INC. 
XX 
PI Liu Q, McDonald TP; 
XX 
DR WPI; 2000-400025/34. 
DR N-PSDB; AAA38861. 
XX 
PT New DNA encoding human HG51 (a G-protein coupled receptor), useful in 
PT chromosomal mapping studies for identifying the chromosomal locations 
PT of the HG51 gene(s) - 
XX 
PS Claim 23; Fig 2; 68pp; English. 
XX 
CC G protein-coupled receptors (GPCR) are important in signal transduction 
CC from the exterior to the interior of cells. Rhodopsin receptors are a 
CC type of GPCR which comprise a chromophore-binding pocket which is 
CC covalently linked by a protonated Schiff base to a Lys residue in 
CC transmembrane domain 7. The present sequence is the human HG51 GPCR and 
CC is a member of the rhodopsin receptor family of GPCRs. Due to the Lys 
CC residue and Schiff base present in HG51, it is thought that the HG51 
CC ligand may be a fatty-acid-like molecule. It is also believed that 
CC agonists and antagonists of HG51 are useful for treating various 
CC disorders such as obesity, type II diabetes, inflammatory bowel disease 
CC constipation or diarrhoea. In addition, the coding sequence for the 
CC present sequence may be used in gene therapy for the above mentioned 
CC disorders. 
XX 
SQ Sequence 402 AA; 

Query Match 100.0%; Score 2117; DB 21; Length 402; 
Best Local Similarity 100.0%; Pred. No. 6.3e-222; 
Matches 402; Conservative 0; Mismatches 0; Indels 0; Gaps 


Qy      1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 mysgnrsgghgywdgggaagaegpapagtlspaplfspgtyerlalllgsigllgvgnnl 60

Qy     61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 lvlvlyykfqrlrtpthlllvnislsdllvslfgvtftfvsclrngwvwdtvgcvwdgfs 120

Qy    121 GSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 gslfgivsiatltvlayeryirvvharvinfswawraityiwlyslawagapllgwnryi 180

Qy    181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ldvhglgctvdwkskdandssfvlflflgclvvplgviahcyghilysirmlrcvedlqt 240

Qy    241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKS 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 iqvikilkyekklakmcflmiftflvcwmpyivicflvvnghghlvtptisivsylfaks 300

Qy    301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 ntvynpviyvfmirkfrrsllqllclrllrcqrpakdlpaagsemqirpivmsqkdgdrp 360

Qy    361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402
          ||||||||||||||||||||||||||||||||||||||||||
Db    361 kkkvtfnsssiifiitsdeslsvddsdktngskvdviqvrpl 402



RESULT 3 
AAE12070 

ID AAE12070 standard; Protein; 402 AA 
XX 



AC AAE12070; 
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Dendritic cell (DC) DCEPR protein. 
XX 

KW Dendritic cell; DC; DCEPR protein; gene therapy; dermatological ; vaccine 

KW atopic dermatitis; autoimmune disease; inflammatory skin disease; cancer 

KW immunosuppressive; AIDS; Acquired immune deficiency syndrome; cytostatic 

KW chromosomal identification; pharmaceutical; hypersensitivity; virucide; 

KW transplant rejection; chronic inflammatory disease; anti-HIV. 

OS Unidentified. 
XX 

PN WO200172773-A2. 
XX 

PD 04-OCT-2001. 
XX 

PF 28-MAR-2001; 2001WO-EP03542 . 
XX 

PR 29-MAR-2000; 2000US-192934P . 

PR 18-MAY-2000; 2000US-205020P . 

PR 18-MAY-2000; 2000US-205026P . 

PR 19-MAY-2000; 2000US-205767P . 

PR 19-MAY-2000; 2000US-205769P . 
XX 

PA (NOVS ) NOVARTIS AG. 

PA (NOVS ) NOVARTIS -ERFINDUNGEN VERW GES MBH 
XX 

PI Werner G, Phares W, Jaritz M, Lapp H, Kalthoff FS; 
XX 

DR WPI; 2001-616466/71. 

DR N-PSDB; AAD19720. 
XX 

PT New polypeptides for screening therapeutic agonists and antagonists 

PT comprise dendritic cell polypeptides - 

PS Claim 2; Page 42; 52pp; English. 
XX 

CC The invention relates to dendritic cell (DC) proteins and their 
CC corresponding DNA molecules. A pharmaceutical composition comprising 
CC agonist and antagonist of DC proteins are useful for treating abnormal 
CC conditions related to both an excess of and insufficient level of 
CC expression of DC gene, or related to both an excess of and insufficient 
CC activity of DC protein. Soluble form of DC proteins are used as an active 
CC ingredient in combination with pharmaceutical acceptable carriers 
CC DC genes and proteins are useful for treating chronic inflammatory 
CC diseases, autoimmune diseases, transplant rejection crisis, including 
CC inflammatory skin diseases such as contact hypersensitivity, atopic 
CC dermatitis or virally-induced immune suppression such as AIDS and cancer 
CC DC protein is useful for inducing immunological response in a mammal and 
CC as immunogen to produce antibodies immunospecific for the polypeptide 
CC DC gene is useful in gene therapy. DC gene is also useful as a diagnostic 
CC tool and/or for chromosomal identification. The present sequence is 
CC dendritic cell (DC) DCEPR protein which is found to belong to the family 
CC of G-protein coupled receptor protein. 
XX 
SQ Sequence 402 AA; 

Query Match 99.4%; Score 2105; DB 22; Length 402; 
Best Local Similarity 99.5%; Pred. No. 1.3e-220; 
Matches 400; Conservative 1; Mismatches 1; Indels 0; Gaps 



Qy 
Db 



1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSP 60 
1 'NlllllllllllllllllhllllllllllllllllllllMIIIIIIIIIIIIIIII 
1 mysgnrsgghgywdgggaagakgpapagtlspaplfspgtyerlalllgsigll^gAnl 60 



QY 61 LVL VL YYKFQRLRT PTHLLL VNI S LSDLL VS LFG VT FT F VS C LRNGWVWDTVG CVWDG F S 



Db 



fi ' 1 ] 1 J 1 'in 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 mm mi i ii ii i Tim TiTm i ill 1 1 TT II 120 

61 Ivlvlyykfqrlrtpthlllvnis^ 120 



Qy 121 GSLFGIVSIATLTVLAYERYIRWHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180 

nK 101 IIIINinilMIIIIIIIIIIIIIIIIIIIIIIIMIMIIIIIIIMIMIMIIII 

121 9 slf 9ivsiatltvlayeryirvvharviiifswawraityiwlyslawagapllgwnryi 180 

Qy 181 LD VHGLGCTVDWKS KDAND S S F VLFLF LGCL WP LG VI AHC YGH I L YS I RMLRC VEDLQT 240 

nh i«! II^N'^^'i 1 !! 11111111111111111111111111 '!!!!!!!!!!!!!!!!! 

Db 181 lavhglgctvdwkskdandssfvlflflgclwplgviahcyghilysirmlrcvedlqt 240 

Qy 241 IQVI KI LKYEKKLAKMCFLMI FTFLVCWMPYI VICFLWNGHGHLVTPTI S I VS YLFAKS 300 

Db ^ I'MIIIIMMIIIIIMIMIIIIIIIIMMIIIIIIIIIIIII IMIII 

241 1 ^ lkl lkyekklakmcflmiftflvcwmpyivicflvvnghghlvtptisivsylfaks 300 

Qy 301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360 

Db 30 ''''''HiNMNiMim 

Db 301 ntvynpviyvfmirkfrrsllqllclrllrcqrpakdlpaagsemqirpivmsqkdgdrp 360 

Qy 361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402 

nh w Ul'li" 11 !!'! I II I I II II II II I I II H II I I II I I 

Db 361 kkkvtfnsssnfigtsdeslsvddsdktngskvdviqvrpl 402 
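
Every record in this listing uses the same flat-file layout: a two-letter line code (ID, AC, DT, DE, KW, and so on), the field text, and XX separator lines. A minimal sketch of how such lines could be grouped by code, assuming one field per line and a single space after the code; the function name `group_by_line_code` is hypothetical and the sample lines are copied from the AAE12070 record above:

```python
from collections import defaultdict


def group_by_line_code(lines):
    """Collect field text under its two-letter line code, skipping XX separators."""
    fields = defaultdict(list)
    for line in lines:
        code, _, value = line.strip().partition(" ")
        if len(code) == 2 and code != "XX" and value:
            fields[code].append(value.strip())
    return dict(fields)


record = [
    "ID AAE12070 standard; Protein; 402 AA",
    "XX",
    "AC AAE12070;",
    "XX",
    "DT 18-DEC-2001 (first entry)",
    "PA (NOVS ) NOVARTIS AG.",
    "PA (NOVS ) NOVARTIS -ERFINDUNGEN VERW GES MBH",
]

fields = group_by_line_code(record)
print(fields["AC"])       # ['AAE12070;']
print(len(fields["PA"]))  # 2
```

Repeated codes such as PA, PR or CC accumulate in order, so multi-line fields can be joined afterwards if needed.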

RESULT 4 
AAB64743 

ID AAB64743 standard; Protein; 199 AA. 
XX 

AC AAB64743; 
XX 
DT 23-MAR-2001 (first entry) 
XX 
DE Human secreted protein sequence encoded by gene 15 SEQ ID NO: 137. 
XX 
KW Human; secreted protein; diagnosis; cytostatic; antirheumatic; 
KW antiarthritic; dermalogical; cardiant; antiinflammatory; anti-ulcer; 
KW gastrointestinal; solid tumour; rheumatoid arthritis; psoriasis; 
KW diabetic retinopathy; myocardial angiogenesis; Crohn's disease; 
KW ulcer. 
XX 
OS Homo sapiens. 
XX 
PN WO200077237-A1. 
XX 
PD 21-DEC-2000. 
XX 
PF 01-JUN-2000; 2000WO-US14928. 
XX 
PR 11-JUN-1999; 99US-0138633. 
XX 
PA (HUMA-) HUMAN GENOME SCI INC. 
PA (ROSE/) ROSEN C A. 
XX 
PI Rosen CA, Ruben SM, Komatsoulis GA; 
XX 
DR WPI; 2001-071280/08. 
XX 
PT Nucleic acids encoding 49 human secreted polypeptides, useful for 
PT preventing, diagnosing and/or treating diseases such as tumors, 
PT rheumatoid arthritis, psoriasis and diabetic retinopathy - 
XX 
PS Disclosure; Page 503; 520pp; English. 
XX 
CC The polynucleotide sequences given in AAF33037 to AAF33085 encode the 
CC human secreted proteins given in AAB64666 to AAB64714. AAB64715 to 
CC AAB64771 represent human secreted polypeptide sequences and proteins 
CC homologous to them, which are given in the exemplification of the present 
CC invention. Human secreted proteins have activities based on the tissues 
CC and cells the genes are expressed in. Examples of activities include: 
CC cytostatic; antirheumatic; antiarthritic; dermalogical; cardiant; 
CC antiinflammatory; gastrointestinal; and anti-ulcer. The polynucleotides 
CC and polypeptides can be used in the prevention, treatment and diagnosis 
CC of diseases associated with inappropriate polypeptide expression. 
CC Disorders that may be treated or prevented include solid tumours, 
CC rheumatoid arthritis, psoriasis, diabetic retinopathy, myocardial 
CC angiogenesis, Crohn's disease and ulcers. The polynucleotides and their 
CC complementary sequences may also be used as DNA probes in diagnostic 
CC assays (e.g. polymerase chain reactions (PCR)) to detect and quantitate 
CC the presence of similar nucleic acid sequences in samples, and therefore 
CC which patients may be in need of restorative therapy. The polypeptides 
CC may also be used as antigens in the production of antibodies against the 
CC polypeptide and in assays to identify modulators (agonists and 
CC antagonists) of polypeptide expression and activity. The anti-polypeptide 
CC antibodies and antagonists may also be used to down regulate expression 
CC and activity. AAF33028 to AAF33036 and AAB64665 represent sequences used 
CC in the exemplification of the present invention. 
XX 

SQ Sequence 199 AA; 

Query Match 50.2%; Score 1063; DB 22; Length 199; 

Best Local Similarity 100.0%; Pred. No. 2e-107; 

Matches 199; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
Qy    118 GFSGSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWN 177
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 gfsgslfgivsiatltvlayeryirvvharvinfswawraityiwlyslawagapllgwn 60

Qy    178 RYILDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVED 237
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 ryildvhglgctvdwkskdandssfvlflflgclvvplgviahcyghilysirmlrcved 120

Qy    238 LQTIQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLF 297
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 lqtiqvikilkyekklakmcflmiftflvcwmpyivicflvvnghghlvtptisivsylf 180

Qy    298 AKSNTVYNPVIYVFMIRKF 316
          |||||||||||||||||||
Db    181 aksntvynpviyvfmirkf 199

SUMMARIES

                  %
Result          Query
   No.   Score  Match  Length  DB  ID                  Description

     1     451   21.3     348   2  US-08-390-000A-8    Sequence 8, Appli
     2     444   21.0     348   4  US-08-430-286A-11   Sequence 11, Appl
     3   420.5   19.9     309   1  US-08-118-270-56    Sequence 56, Appl
     4   420.5   19.9     309   5  PCT-US93-08528-56   Sequence 56, Appl
     5   354.5   16.7     297   1  US-08-118-270-58    Sequence 58, Appl
     6   354.5   16.7     297   5  PCT-US93-08528-58   Sequence 58, Appl
     7   341.5   16.1     305   1  US-08-118-270-59    Sequence 59, Appl
     8   341.5   16.1     305   5  PCT-US93-08528-59   Sequence 59, Appl
     9   338.5   16.0     297   1  US-08-118-270-57    Sequence 57, Appl
    10   338.5   16.0     297   5  PCT-US93-08528-57   Sequence 57, Appl
    11     309   14.6     391   1  US-07-816-283-2     Sequence 2, Appli
    12     309   14.6     391   1  US-08-417-103-2     Sequence 2, Appli
    13     309   14.6     391   1  US-08-417-103-14    Sequence 14, Appl
    14     304   14.4     391   1  US-07-816-283-4     Sequence 4, Appli
    15     304   14.4     391   1  US-08-417-103-4     Sequence 4, Appli



Result          Query
   No.   Score  Match  Length  DB  ID       Description

     1   477.5   22.6     349   1  JC5490   opsin, pineal glan
     2     475   22.4     351   1  A55962   opsin, pineal glan
     3     464   21.9     352   2  150081   rhodopsin - green
     4     458   21.6     348   1  OOBO     rhodopsin - bovine
     5   456.5   21.6     348   1  JC4267   opsin - rabbit
     6     455   21.5     348   1  S23398   rhodopsin - Chines
     7   452.5   21.4     351   2  S29152   rhodopsin - chicke
     8     451   21.3     348   1  OOHU     rhodopsin - human
     9     451   21.3     354   1  S27231   rhodopsin - northe
    10     450   21.3     348   1  A23665   opsin - mouse
    11     448   21.2     354   1  151200   rhodopsin - Africa



Result          Query
   No.   Score  Match  Length  DB  ID           Description

     1    2117  100.0     402   1  OPN3_HUMAN   Q9h1y3 homo sapien
     2    1862   88.0     400   1  OPN3_MOUSE   Q9wuk7 mus musculu
     3   477.5   22.6     349   1  OPSP_COLLI   P51476 columba liv
     4     475   22.4     351   1  OPSP_CHICK   P51475 gallus gall
     5     468   22.1     352   1  OPSD_ALLMI   P52202 alligator m
     6   466.5   22.0     444   1  OPSP_PETMA   O42490 petromyzon
     7     464   21.9     352   1  OPSD_ANOCA   P41591 anolis caro
     8     458   21.6     348   1  OPSD_BOVIN   P02699 bos taurus
     9   456.5   21.6     348   1  OPSD_RABIT   P49912 oryctolagus
    10     455   21.5     348   1  OPSD_CRIGR   P28681 cricetulus
    11     455   21.5     348   1  OPSD_MACFA   Q28886 macaca fasc
    12     455   21.5     354   1  OPSD_RANCA   P51470 rana catesb



RESULT 1 
OPN3_HUMAN 

ID OPN3_HUMAN STANDARD; PRT; 402 AA. 

AC Q9H1Y3; Q9Y344; 

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 01-MAR-2002 (Rel. 41, Last annotation update) 

DE Opsin 3 (Encephalopsin) (Panopsin). 

GN OPN3 OR ECPN. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
OX NCBI_TaxID=9606; 
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99252448; PubMed=10234000; 
RA Blackshaw S., Snyder S.H.; 

RT "Encephalopsin: a novel mammalian extraretinal opsin discretely 

RT localized in the brain."; 

RL J. Neurosci. 19:3681-3690(1999) 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21295039; PubMed=11401433; 

RA Halford S., Freedman M.S., Bellingham J., Inglis S.L., 

RA Poopalasundaram S., Soni B.G., Foster R.G., Hunt D.M.; 

RT "Characterization of a novel human opsin gene with wide tissue 

RT expression and identification of embedded and flanking genes on 

RT chromosome 1q43."; 

RL Genomics 72:203-208(2001). 

RN [3] 

RP SEQUENCE FROM N.A. 
RA Parker A. ; 

RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: May play a role in encephalic photoreception.
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- TISSUE SPECIFICITY: Strongly expressed in brain. Highly expressed
CC in the preoptic area and paraventricular nucleus of the
CC hypothalamus. Shows highly patterned expression in other regions
CC of the brain, being enriched in selected regions of the cerebral
CC cortex, cerebellar Purkinje cells, a subset of striatal neurons,
CC selected thalamic nuclei, and a subset of interneurons in the
CC ventral horn of the spinal cord.
CC -!- SIMILARITY: BELONGS TO FAMILY 1 OF G-PROTEIN COUPLED RECEPTORS.
CC OPSIN SUBFAMILY.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; AF140242; AAD32671.1; -.
DR EMBL; AF303588; AAK37447.1; -.
DR EMBL; AL133390; CAC19785.1; -.
DR InterPro; IPR000276; GPCR_Rhodpsn.
DR Pfam; PF00001; 7tm_1; 1.
DR PRINTS; PR00237; GPCRRHODOPSN.
DR PROSITE; PS00237; G_PROTEIN_RECEP_F1_1; 1.
DR PROSITE; PS50262; G_PROTEIN_RECEP_F1_2; 1.
DR PROSITE; PS00238; OPSIN; 1.

KW Photoreceptor; Retinal protein; Transmembrane; Lipoprotein; Palmitate; 

KW G-protein coupled receptor. 

FT DOMAIN        1     40       EXTRACELLULAR (POTENTIAL).
FT TRANSMEM     41     65       1 (POTENTIAL).
FT DOMAIN       66     77       CYTOPLASMIC (POTENTIAL).
FT TRANSMEM     78    102       2 (POTENTIAL).
FT DOMAIN      103    117       EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    118    137       3 (POTENTIAL).
FT DOMAIN      138    153       CYTOPLASMIC (POTENTIAL).
FT TRANSMEM    154    177       4 (POTENTIAL).
FT DOMAIN      178    201       EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    202    229       5 (POTENTIAL).
FT DOMAIN      230    255       CYTOPLASMIC (POTENTIAL).
FT TRANSMEM    256    279       6 (POTENTIAL).
FT DOMAIN      280    287       EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    288    312       7 (POTENTIAL).
FT DOMAIN      313    402       CYTOPLASMIC (POTENTIAL).
FT DISULFID    114    188       BY SIMILARITY.
FT BINDING     299    299       RETINAL CHROMOPHORE.
FT LIPID       325    325       PALMITATE (BY SIMILARITY).
FT CARBOHYD      5      5       N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    198    198       N-LINKED (GLCNAC...) (POTENTIAL).
FT CONFLICT    390    396       NGSKVDV -> IGVQSLML (IN REF 1).
SQ SEQUENCE 402 AA; 44873 MW; 370F64C19F834A71 CRC64;
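
The DOMAIN and TRANSMEM features above should together cover the 402-residue sequence once, with no gaps or overlaps. The sketch below checks that, assuming the first DOMAIN line reads residues 1 to 40; the helper name `tiles_exactly` is illustrative only.

```python
# (start, end) residue ranges copied from the FT DOMAIN and TRANSMEM lines.
# Assumption: the first DOMAIN line is read as residues 1-40.
segments = [
    (1, 40), (41, 65), (66, 77), (78, 102), (103, 117),
    (118, 137), (138, 153), (154, 177), (178, 201), (202, 229),
    (230, 255), (256, 279), (280, 287), (288, 312), (313, 402),
]


def tiles_exactly(ranges, length):
    """True if the 1-based inclusive ranges cover 1..length with no gap or overlap."""
    expected_start = 1
    for start, end in sorted(ranges):
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


print(tiles_exactly(segments, 402))
```

The same check can be reused for any feature table in which topological segments are expected to partition the sequence.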

Query Match 100.0%; Score 2117; DB 1; Length 402;
Best Local Similarity 100.0%; Pred. No. 1e-136;
Matches 402; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy      1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60

Qy     61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120

Qy    121 GSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 GSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180

Qy    181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQT 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQT 240

Qy    241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKS 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKS 300

Qy    301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360

Qy    361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402
          ||||||||||||||||||||||||||||||||||||||||||
Db    361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402



Result             Query
   No.    Score    Match %  Length  DB  ID       Description
     1    500.5     23.6      352   13  Q9W6K3   Q9w6k3 anolis caro
     2    491.5     23.2      534   13  O57422   O57422 xenopus lae
     3    484.5     22.9      346   13  Q9PUA9   Q9pua9 bufo japoni
     4      480     22.7      357   13  Q9IBH2   Q9ibh2 phelsuma ma
     5    473.5     22.4      377   13  Q9IB88   Q9ib88 brachydanio
     6      473     22.3      543   13  Q90YK6   Q90yk6 gallus gall
     7      458     21.6      348    6  Q95KU1   Q95ku1 felis silve
     8    457.5     21.6      351   13  Q9IA36   Q9ia36 poephila gu
     9    455.5     21.5      351   13  Q9W6S0   Q9w6s0 columba liv
    10      455     21.5      363   13  Q98TH3   Q98th3 cynops pyrr
    11    453.5     21.4      322   13  O57448   O57448 anas platyr



09831765Results 
SEQ ID NO: 1 

SUMMARIES

  Result             Query
     No.    Score    Match %  Length  DB  ID         Description
       1   1470.4     95.7     2533    9  AF303588   AF303588 Homo sapi
       2   1380.8     89.8     2110    9  AF140242   AF140242 Homo sapi
       3    943.6     61.4     1737   10  AF140241   AF140241 Mus muscu
c      4    515.4     33.5    71872    9  AL133390   AL133390 Human DNA
c      5      424     27.6     5024    6  AX281720   AX281720 Sequence
c      6    422.4     27.5     5000    6  A68713     A68713 Sequence 5
c      7    422.4     27.5     5000    6  AR106624   AR106624 Sequence
c      8    422.4     27.5     5000    9  AF056032   AF056032 Homo sapi
       9    400.4     26.1      449    6  AX013137   AX013137 Sequence
      10      358     23.3      512    6  AX062411   AX062411 Sequence
c     11    317.2     20.6     8303    6  AX345325   AX345325 Sequence
      12    299.4     19.5     8303    6  AX345324   AX345324 Sequence



RESULT 1
AF303588
LOCUS       AF303588 2533 bp mRNA linear PRI 17-APR-2001
DEFINITION  Homo sapiens panopsin (OPN3) mRNA, complete cds.
ACCESSION   AF303588
VERSION     AF303588.1 GI:13649589
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 2533)
  AUTHORS   Halford,S., Freedman,M.S., Bellingham,J., Inglis,S.L.,
            Poopalasundaram,S., Soni,B.G., Foster,R.G. and Hunt,D.M.
  TITLE     Characterization of a novel human opsin gene with wide tissue
            expression and identification of embedded and flanking genes on
            chromosome 1q43
  JOURNAL   Genomics 72 (2), 203-208 (2001)
  MEDLINE   21295039
REFERENCE   2 (bases 1 to 2533)
  AUTHORS   Halford,S., Bellingham,J., Freedman,M.S., Inglis,S.L.,
            Poopalasundaram,S., Foster,R. and Hunt,D.M.
  TITLE     Direct Submission
  JOURNAL   Submitted (05-SEP-2000) Molecular Genetics, Institute of
            Ophthalmology, 11-43 Bath Street, London EC1V 9EL, UK
FEATURES             Location/Qualifiers
     source          1..2533
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="1"
                     /map="1q43"
     gene            1..2533
                     /gene="OPN3"
     CDS             103..1311
                     /gene="OPN3"
                     /note="multiple tissue opsin"
                     /codon_start=1
                     /product="panopsin"
                     /protein_id="AAK37447.1"
                     /db_xref="GI:13649590"
                     /translation="MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERL
                     ALLLGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSC
                     LRNGWVWDTVGCVWDGFSGSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITY
                     IWLYSLAWAGAPLLGWNRYILDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVI
                     AHCYGHILYSIRMLRCVEDLQTIQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICF
                     LVVNGHGHLVTPTISIVSYLFAKSNTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRP
                     AKDLPAAGSEMQIRPIVMSQKDGDRPKKKVTFNSSSIIFIITSDESLSVDDSDKTNGS
                     KVDVIQVRPL"
BASE COUNT  640 a 592 c 567 g 734 t
ORIGIN

Query Match 95.7%; Score 1470.4; DB 9; Length 2533;
Best Local Similarity 99.6%; Pred. No. 7.8e-229;
Matches 1474; Conservative 0; Mismatches 6; Indels 0; Gaps 0;

Qy     58 cagagcaggcggcggagccccagccccacccagtgcggagcgcgccgcgagccccgccgc 117
          | ||   |  ||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 CGGACGCGTGGGCGGAGCCCCAGCCCCACCCAGTGCGGAGCGCGCCGCGAGCCCCGCCGC 60

Qy    118 aagctgagcgcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgc 177
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 AAGCTGAGCGCCTCCGCCCGCCAGGCGCGCCGGCGCCGGGCCATGTACTCGGGGAACCGC 120

Qy    178 agcggcggccacggctactgggacggcggcggggccgcgggcgctgaggggccggcgccg 237
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 AGCGGCGGCCACGGCTACTGGGACGGCGGCGGGGCCGCGGGCGCTGAGGGGCCGGCGCCG 180

Qy    238 gcggggacactgagccccgcgcccctcttcagccccggcacctacgagcgcctggcgctg 297
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 GCGGGGACACTGAGCCCCGCGCCCCTCTTCAGCCCCGGCACCTACGAGCGCCTGGCGCTG 240

Qy    298 ctgctgggctccattgggctgctgggcgtcggcaacaacctgctggtgctcgtcctctac 357
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 CTGCTGGGCTCCATTGGGCTGCTGGGCGTCGGCAACAACCTGCTGGTGCTCGTCCTCTAC 300

Qy    358 tacaagttccagcggctccgcactcccactcacctcctcctggtcaacatcagcctcagc 417
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 TACAAGTTCCAGCGGCTCCGCACTCCCACTCACCTCCTCCTGGTCAACATCAGCCTCAGC 360

Qy    418 gacctgctggtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggc 477
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 GACCTGCTGGTGTCCCTCTTCGGGGTCACCTTTACCTTCGTGTCCTGCCTGAGGAACGGC 420

Qy    478 tgggtgtgggacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggatt 537
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 TGGGTGTGGGACACCGTGGGCTGCGTGTGGGACGGGTTTAGCGGCAGCCTCTTCGGGATT 480

Qy    538 gtttccattgccaccctaaccgtgctggcctatgaacgttacattcgcgtggtccatgcc 597
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GTTTCCATTGCCACCCTAACCGTGCTGGCCTATGAACGTTACATTCGCGTGGTCCATGCC 540

Qy    598 agagtgatcaatttttcctgggcctggagggccattacctacatctggctctactcactg 657
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 AGAGTGATCAATTTTTCCTGGGCCTGGAGGGCCATTACCTACATCTGGCTCTACTCACTG 600

Qy    658 gcgtgggcaggagcacctctcctgggatggaacaggtacatcctggacgtacacggacta 717
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 GCGTGGGCAGGAGCACCTCTCCTGGGATGGAACAGGTACATCCTGGACGTACACGGACTA 660

Qy    718 ggctgcactgtggactggaaatccaaggatgccaacgattcctcctttgtgcttttctta 777
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 GGCTGCACTGTGGACTGGAAATCCAAGGATGCCAACGATTCCTCCTTTGTGCTTTTCTTA 720

Qy    778 tttcttggctgcctggtggtgcccctgggtgtcatagcccattgctatggccatattcta 837
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 TTTCTTGGCTGCCTGGTGGTGCCCCTGGGTGTCATAGCCCATTGCTATGGCCATATTCTA 780

Qy    838 tattccattcgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagatt 897
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 TATTCCATTCGAATGCTTCGTTGTGTGGAAGATCTTCAGACAATTCAAGTGATCAAGATT 840

Qy    898 ttaaaatatgaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtc 957
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 TTAAAATATGAAAAGAAACTGGCCAAAATGTGCTTTTTAATGATATTCACCTTCCTGGTC 900

Qy    958 tgttggatgccttatatcgtgatctgcttcttggtggttaatggtcatggtcacctggtc 1017
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 TGTTGGATGCCTTATATCGTGATCTGCTTCTTGGTGGTTAATGGTCATGGTCACCTGGTC 960

Qy   1018 actccaacaatatctattgtttcgtacctctttgctaaatcgaacactgtatacaatcca 1077
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 ACTCCAACAATATCTATTGTTTCGTACCTCTTTGCTAAATCGAACACTGTATACAATCCA 1020

Qy   1078 gtgatttatgtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctc 1137
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 GTGATTTATGTCTTCATGATCAGAAAGTTTCGAAGATCCCTTTTGCAGCTTCTGTGCCTC 1080

Qy   1138 cgactgctgaggtgccagaggcctgctaaagacctaccagcagctggaagtgaaatgcag 1197
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1081 CGACTGCTGAGGTGCCAGAGGCCTGCTAAAGACCTACCAGCAGCTGGAAGTGAAATGCAG 1140

Qy   1198 atcagacccattgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttc 1257
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1141 ATCAGACCCATTGTGATGTCACAGAAAGATGGGGACAGGCCAAAGAAAAAAGTGACTTTC 1200

Qy   1258 aactcttcttccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagc 1317
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1201 AACTCTTCTTCCATCATTTTTATCATCACCAGTGATGAATCACTGTCAGTTGACGACAGC 1260

Qy   1318 gacaaaaccaatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaaga 1377
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 GACAAAACCAATGGGTCCAAAGTTGATGTAATCCAAGTTCGTCCTTTGTAGGAATGAAGA 1320

Qy   1378 atggcaacgaaagatggggccttaaattggatgccacttttggactttcatcataagaag 1437
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 ATGGCAACGAAAGATGGGGCCTTAAATTGGATGCCACTTTTGGACTTTCATCATAAGAAG 1380

Qy   1438 tgtctggaatacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccg 1497
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 TGTCTGGAATACCCGTTCTATGTAATATCAACAGAACCTTGTGGTCCAGCAGGAAATCCG 1440

Qy   1498 aattgcccatatgctcttgggcctcaggaagaggttgaac 1537
          ||||||||||||||||||||||||||||||||||||||||
Db   1441 AATTGCCCATATGCTCTTGGGCCTCAGGAAGAGGTTGAAC 1480


RESULT 2 
AF140242
LOCUS       AF140242 2110 bp mRNA linear PRI 27-MAY-1999
DEFINITION  Homo sapiens encephalopsin mRNA, complete cds.
ACCESSION   AF140242
VERSION     AF140242.1 GI:4894951
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 2110)
  AUTHORS   Blackshaw,S. and Snyder,S.H.
  TITLE     Encephalopsin: A novel mammalian extraretinal opsin discretely
            localized in the brain
  JOURNAL   J. Neurosci. 19 (10), 3681-3690 (1999)
  MEDLINE   99252448
   PUBMED   10234000
REFERENCE   2 (bases 1 to 2110)
  AUTHORS   Blackshaw,S. and Snyder,S.H.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-APR-1999) Genetics, Harvard Medical School, 200
            Longwood Ave., Boston, MA 02115, USA
FEATURES             Location/Qualifiers
     source          1..2110
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     CDS             56..1267
                     /note="extraretinal opsin"
                     /codon_start=1
                     /product="encephalopsin"
                     /protein_id="AAD32671.1"
                     /db_xref="GI:4894952"
                     /translation="MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERL
                     ALLLGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSC
                     LRNGWVWDTVGCVWDGFSGSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITY
                     IWLYSLAWAGAPLLGWNRYILDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVI
                     AHCYGHILYSIRMLRCVEDLQTIQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICF
                     LVVNGHGHLVTPTISIVSYLFAKSNTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRP
                     AKDLPAAGSEMQIRPIVMSQKDGDRPKKKVTFNSSSIIFIITSDESLSVDDSDKTIGV
                     QSLMLIQVRPL"
BASE COUNT  522 a 516 c 480 g 592 t
ORIGIN

Query Match 89.8%; Score 1380.8; DB 9; Length 2110;
Best Local Similarity 98.3%; Pred. No. 2.6e-214;
Matches 1421; Conservative 0; Mismatches 12; Indels 13; Gaps 2;
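
The "Best Local Similarity" figure in the statistics above is consistent with the number of matching columns divided by the total number of aligned columns (matches + mismatches + indels). The sketch below rests on that assumption for nucleotide alignments; `local_similarity` is an illustrative name, not a function of the search software.

```python
def local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Percentage of alignment columns that are exact matches, to one decimal.

    Assumption: every mismatch and every indel position counts as one column.
    """
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Statistics copied from the AF140242 alignment header.
print(local_similarity(1421, 12, 13))

# Statistics copied from the AF303588 alignment header.
print(local_similarity(1474, 6, 0))
```

With these inputs the two calls give 98.3 and 99.6, the values printed for the AF140242 and AF303588 hits.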



Qy 105 



cgagccccgccgcaagctgagcgcctccgcccgccaggcgcgccggcgccgggccatqt 

Him Illlllllllllllliiiiiii Illlllllliinii 



405 catcagcctcagcgacctgctggtgtccctcttcggggtcacctttaccttcgtgtcctq 464 

, ni '''''I'' I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 

301 CATCAGCCTCAGCGACCTGCTGGTGTCCCTCTTCGGGGTCACCTTTACCTTCGTGTCCTG 360 

????f??f a ?? gctg99tgt99gacaccgtggqct 9 c g t gtgggacgggtttagcggcag 
I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I i i i i i i i i 7 i t 7 



a 164 

1 " i l l l l l l l f l I I j I I | | | | | | | | | | | | | | | | | | l | | f | | ( f | [ | | | | | 

Db 1 CGAGCCCCGCCGCAAGCTGAGCGCCTCCGCCCGCCAGGCGCGCCGGCGCCGGGCCATGTA 60 

Qy 165 ctcggggaaccgcagcggcggccacggctactgggacggcggcggggccgcgggcgctga 224 

^ JIM 1 1 Illlllllllllllliiiiiii 1 1 II II Nil 

Db 61 CTCGGGG AAC CG CAGCGG CGGC C ACGG CT ACTGGGACGGCGGCGGGGCCGCGGGCG CTGA 120 

Qy 225 ggggccggcgccggcggggacactgagccccgcgcccctcttcagccccggcacctacqa 284 

IN 1 1 Illllllllllllllll MINI I Illlllllllllllliiiiiii MINIM I 

Db 121 GGGGCCGGCGCCGGCGGGGACACTGAGCCCCGCGCCCCTCTTCAGCCCCGGCACCTACGA 180 
Qy 285 gcgcctggcgctgctgctgggctccattgggctgctgggcgtcggcaacaacctgctggt 344 

nH 1Q1 MMIMIIIIIIIIIIIIIIIIIIIIIIIMIMIII Illllllllllllllll 

Db 181 GCGCCTGGCGCTGCTGCTGGGCTCCATTGGGCTGCTGGGCGTCGGCAACAACCTGCTGGT 24 0 
Qy 345 gctcgtcctctactacaagttccagcggctccgcactcccactcacctcctcctqqtcaa 404 

nK n 1 1 1 1 1 1 M I II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 241 GCTCGTCCTCTACTACAAGTTCCAGCGGCTCCGCACTCCCACTCACCTCCTCCTGGTCAA 300 

Qy 

Db 

Qy 465 ^VV?VV???? 9Ctgg9C9C99gacaccgt999Ctgc 9 t g t ggg a cgg9tttagcggcag 524 

nK „ IMMIIIIIIIIIIIII MIMMIIIIMIIMMIIMIIIIIINIIIMI 

Db 361 CCTGAGGAACGGCTGGGTGTGGGACACCGTGGGCTGCGTGTGGGACGGGTTTAGCGGCAG 420 

Qy 525 cctcttcgggattgtttccattgccaccctaaccgtgctggcctatgaacgttacattcq 584 

I I I I I I I I I I I M I I I I M I I I I I I I I I | | | | | M I I I I I I I I | | I I I I I I I I I I I | | | | 
Db 4 21 CCTCTTCGGGATTGTTTCCATTGCCACCCTAACCGTGCTGGCCTATGAACGTTACATTCG 4 80 

Qy 585 cgtggtccatgccagagtgatcaatttttcctgggcctggagggccattacctacatctg 644 

I I I I I I I I I I I I I I M I I I I I I I I I I | | | | | | | | | | | I I I I I I I M | I I I I I I I I I I I | | 
Db 481 CGTGG TCC ATGCC AG AG TG ATC AATTTTTC CTGGG CCTGG AGGG C C ATTACCT ACATCTG 540 

Qy 645 gctctactcactggcgtgggcaggagcacctctcctgggatggaacaggtacatcctgqa 704 

^ B I I II I III Illllllllllllllll Mil Ml,!: IIIIMIIIIIIIIIIII 

Db 541 GCTCTACTCACTGGCGTGGGCAGGAGCACCTCTCCTGGGATGGAACAGGTACATCCTGGA 600 

Qy 705 cgtacacggactaggctgcactgtggactggaaatccaaggatgccaacgattcctcctt 764 

r , c I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I | I I I I I I I I I I I | | | | | | 

Db 601 CGTACACGGACTAGGCTGCACTGTGGACTGGAAATCCAAGGATGCCAACGATTCCTCCTT 660 

Qy 765 tgtgcttttcttatttcttggctgcctggtggtgcccctgggtgtcatagcccattgcta 824 

nK c IIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIMIMIIIIIII 

Db 661 TGTGCTTTTCTTATTTCTTGGCTGCCTGGTGGTGCCCCTGGGTGTCATAGCCCATTGCTA 720 

Qy 825 tggccatattctatattccattcgaatgcttcgttgtgtggaagatcttcagacaattca 884 

I I I ' I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | || | | | | | | 
Db 721 TGGCCATATTCTATATTCCATTCGAATGCTTCGTTGTGTGGAAGATCTTCAGACAATTCA 780 

Qy 885 



agtgatcaagattttaaaatatgaaaagaaactggccaaaatgtgctttttaatgat 
I f I I I I I I I I ! I I I I I I I I i I I I I I I I I I I I I I I I ! I I I I I Mill I I I I I I I i i i i 



att 944 



nh _ ai ] j, 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 781 AGTGATCAAGATTTTAAAATATGAAAAGAAACTGGCCAAAATGTGCTTTTTAATGATATT 84 0 



Qy 

Db 



945 
841 

Qy 1005 

Db 901 

Qy 1065 

Db 961 

Qy 1125 

Db 1021 

Qy 1185 

Db 1081 

Qy 1245 

Db 1141 

Qy 1305 

Db 1201 

Qy 1362 

Db 1261 

Qy 1422 

Db 1321 

Qy 1472 

Db 1381 

Qy 1532 

Db 1441 



caccttcctggtctgttggatgccttatatcgtgatctgcttcttggtggttaatqqtca 

nillllllllllllllillllMllllilllllMMIIMIIlHlllllllllllll 

CACCTTCCTGGTCTGTTGGATGCCTTATATCGTGATCTGCTTCTTGGTGGTTAATGGTCA 
tggtcacctggtcactccaacaatatctattgtttcgtacctctttgctaaatcqaacac 

MIMIIIIIIIIIIIIIIIIIIIillllMIIMI ' I ! 1 1 ! I ) i j 1 1 [ f I 

TGGTCACCTGGTCACTCCAACAATATCTATTGTTTCGTACCTCTTTGCTAAATCGAACAC 
^^??f atccagtgatttatgtcttcat 9 atca gaaagtttcgaagatcccttttgca 

NilNIIHIIIIIIIIIMIIIIIIIMM 

TGTATACAATCCAGTGATTTATGTCTTCATGATCAGAAAGTTTCGAAGATCCCTTTTGCA 
gcttctgtgcctccgactgctgaggtgccagaggcctgctaaagacctaccaqcaqctqq 

NIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMiliiiiiiiiiiMIIIIIII 

GCTTCTGTGCCTCCGACTGCTGAGGTGCCAGAGGCCTGCTAAAGACCTACCAGCAGCTGG 

AAGTGAAATGCAGATCAGACCCATTGTGATGTCACAGAAAGATGGGGACAGGCCAAAGAA 

mnfnMnmMnM"iMn?iin;?nMMm?;f"?nfnnfM 

AAAAGTGACTTTCAACTCTTCTTCCATCATTTTTATCATCACCAGTGATGAATCACTGTC 
f ?^?f cqacagcgacaaaaccaatgggtccaaa- - -gttgatgtaatccaagttcgtcc 

IMIIIIIIII I I II I IIIIIMIIMIIIII 

AGTTGACGACAGCGACAAAACCATTGGGGTCCAAAGTTTGATGTTAATCCAAGTTCGTCC 
tttgtaggaatgaagaatggcaacgaaagatggggccttaaattggatgccacttttqqa 

III', In hi' IIMIIIIIMIl IIMIMIIMIIIIIIIIIIIIIMIIII 

TTTGTAGGAATGAAGGATGGCAACGAAAGGTGGGGCCTTAAATTGGATGCCACTTTTGGA 
?^ t r? at T C fr aa 9 aa gtgtctggaatacccgttctatgtaatatcaacag 

lll'HIIII MIIMMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIl 

CTTTCATCATCCTCCTGAAGAAGAAGTGTCTGGAATACCCGTTCTATGTAATATCAACAG 

nTrTTT?fTMTTTTTrTiMrTrTMnTr?TnrrfnTrrTmrrnTTTTTTT? 

AACCTTGTGGTCCAGCAGGAAATCCGAATTGCCCATATGCTCTTGGGCCTCAGGAAGAGG 
ttgaac 1537 

nun 

TTGAAC 144 6 



1004 
900 
1064 
960 
1124 
1020 
1184 
1080 
1244 
1140 
1304 
1200 
1361 
1260 
1421 
1320 
1471 
1380 
1531 
1440 



SUMMARIES

  Result             Query
     No.    Score    Match %  Length  DB  ID         Description
       1     1537    100.0     1537   21  AAA38861   Human G-protein co
       2     1459     94.9     2144   21  AAA73212   Human 17723 recept
       3   1373.2     89.3     2037   22  AAD19720   Dendritic cell (DC
       4     1028     66.9     1697   22  AAF33051   Human secreted pro
       5    687.8     44.7     1267   21  AAZ34604   Human receptor mol
       6    644.4     41.9     1763   21  AAC69518   Human secreted pro
       7      437     28.4      619   22  AAD19721   Dendritic cell (DC
c      8      435     28.3    12291   22  AAK79265   Human immune/haema
c      9      424     27.6     5024   24  AAS94874   Human DNA sequence
c     10    422.4     27.5     5000   19  AAV20609   Human kynurenine-3
      11    400.4     26.1      449   20  AAZ42057   Human endometrium



RESULT 1
AAA38861
ID AAA38861 standard; cDNA; 1537 BP.
XX
AC AAA38861;
XX
DT 31-AUG-2000 (first entry)
XX
DE Human G-protein coupled receptor, HG51, coding sequence.
XX
KW Human; G-protein coupled receptor; HG51; signal transduction;
KW rhodopsin receptor; obesity; type II diabetes;
KW inflammatory bowel disease; constipation; diarrhoea; gene therapy; ss.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT CDS 160..1368
FT /*tag= a
FT /product= "Human HG51"
XX
PN WO200031108-A1.
XX
PD 02-JUN-2000.
XX
PF 18-NOV-1999; 99WO-US27305.
XX
PR 24-NOV-1998; 98US-0109717.
XX
PA (MERI ) MERCK & CO INC.
XX
PI Liu Q, McDonald TP;
XX
DR WPI; 2000-400025/34.
DR P-PSDB; AAY98008.
XX
PT New DNA encoding human HG51 (a G-protein coupled receptor), useful in
PT chromosomal mapping studies for identifying the chromosomal locations
PT of the HG51 gene(s) -
XX
PS Claim 1; Fig 1; 68pp; English.
XX
CC G protein-coupled receptors (GPCR) are important in signal transduction
CC from the exterior to the interior of cells. Rhodopsin receptors are a
CC type of GPCR which comprise a chromophore-binding pocket which is
CC covalently linked by a protonated Schiff base to a Lys residue in
CC transmembrane domain 7. The present sequence is the coding sequence of
CC the human HG51 GPCR and is a member of the rhodopsin receptor family of
CC GPCRs. Due to the Lys residue and Schiff base present in HG51, it is
CC thought that the HG51 ligand may be a fatty-acid-like molecule. It is
CC also believed that agonists and antagonists of HG51 are useful for
CC treating various disorders such as obesity, type II diabetes,
CC inflammatory bowel disease, constipation or diarrhoea. In addition, the
CC present sequence may be used in gene therapy for the above mentioned
CC disorders.
XX
SQ Sequence 1537 BP; 320 A; 426 C; 421 G; 370 T; 0 other;



Query Match 100.0%; Score 1537; DB 21; Length 1537;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1537; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy      1 ggggccacggggggtgcgccggcgcgcggtagcgcgggcccctcagtgcacaatggccag 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ggggccacggggggtgcgccggcgcgcggtagcgcgggcccctcagtgcacaatggccag 60

Qy     61 agcaggcggcggagccccagccccacccagtgcggagcgcgccgcgagccccgccgcaag 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 agcaggcggcggagccccagccccacccagtgcggagcgcgccgcgagccccgccgcaag 120

Qy    121 ctgagcgcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgcagc 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 ctgagcgcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgcagc 180

Qy    181 ggcggccacggctactgggacggcggcggggccgcgggcgctgaggggccggcgccggcg 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 ggcggccacggctactgggacggcggcggggccgcgggcgctgaggggccggcgccggcg 240

Qy    241 gggacactgagccccgcgcccctcttcagccccggcacctacgagcgcctggcgctgctg 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 gggacactgagccccgcgcccctcttcagccccggcacctacgagcgcctggcgctgctg 300

Qy    301 ctgggctccattgggctgctgggcgtcggcaacaacctgctggtgctcgtcctctactac 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 ctgggctccattgggctgctgggcgtcggcaacaacctgctggtgctcgtcctctactac 360

Qy    361 aagttccagcggctccgcactcccactcacctcctcctggtcaacatcagcctcagcgac 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 aagttccagcggctccgcactcccactcacctcctcctggtcaacatcagcctcagcgac 420

Qy    421 ctgctggtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggctgg 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 ctgctggtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggctgg 480

Qy    481 gtgtgggacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggattgtt 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 gtgtgggacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggattgtt 540

Qy    541 tccattgccaccctaaccgtgctggcctatgaacgttacattcgcgtggtccatgccaga 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 tccattgccaccctaaccgtgctggcctatgaacgttacattcgcgtggtccatgccaga 600

Qy    601 gtgatcaatttttcctgggcctggagggccattacctacatctggctctactcactggcg 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 gtgatcaatttttcctgggcctggagggccattacctacatctggctctactcactggcg 660

Qy    661 tgggcaggagcacctctcctgggatggaacaggtacatcctggacgtacacggactaggc 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 tgggcaggagcacctctcctgggatggaacaggtacatcctggacgtacacggactaggc 720

Qy    721 tgcactgtggactggaaatccaaggatgccaacgattcctcctttgtgcttttcttattt 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 tgcactgtggactggaaatccaaggatgccaacgattcctcctttgtgcttttcttattt 780

Qy    781 cttggctgcctggtggtgcccctgggtgtcatagcccattgctatggccatattctatat 840
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    781 cttggctgcctggtggtgcccctgggtgtcatagcccattgctatggccatattctatat 840

Qy    841 tccattcgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagatttta 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 tccattcgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagatttta 900

Qy    901 aaatatgaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtctgt 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 aaatatgaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtctgt 960

Qy    961 tggatgccttatatcgtgatctgcttcttggtggttaatggtcatggtcacctggtcact 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 tggatgccttatatcgtgatctgcttcttggtggttaatggtcatggtcacctggtcact 1020

Qy   1021 ccaacaatatctattgtttcgtacctctttgctaaatcgaacactgtatacaatccagtg 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 ccaacaatatctattgtttcgtacctctttgctaaatcgaacactgtatacaatccagtg 1080

Qy   1081 atttatgtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctccga 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1081 atttatgtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctccga 1140

Qy   1141 ctgctgaggtgccagaggcctgctaaagacctaccagcagctggaagtgaaatgcagatc 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1141 ctgctgaggtgccagaggcctgctaaagacctaccagcagctggaagtgaaatgcagatc 1200

Qy   1201 agacccattgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttcaac 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1201 agacccattgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttcaac 1260

Qy   1261 tcttcttccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagcgac 1320
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1261 tcttcttccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagcgac 1320

Qy   1321 aaaaccaatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaagaatg 1380
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1321 aaaaccaatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaagaatg 1380

Qy   1381 gcaacgaaagatggggccttaaattggatgccacttttggactttcatcataagaagtgt 1440
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1381 gcaacgaaagatggggccttaaattggatgccacttttggactttcatcataagaagtgt 1440

Qy   1441 ctggaatacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccgaat 1500
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1441 ctggaatacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccgaat 1500

Qy   1501 tgcccatatgctcttgggcctcaggaagaggttgaac 1537
          |||||||||||||||||||||||||||||||||||||
Db   1501 tgcccatatgctcttgggcctcaggaagaggttgaac 1537

MIMIIIIIIIIIIIIIIIMIIIIIIIIIIIIII 



RESULT 2 
AAA73212 

ID AAA73212 standard; cDNA; 2144 BP. 
XX 

AC AAA73212; 
XX 

DT 05-DEC-2000 (first entry) 
XX 

DE Human 17723 receptor protein encoding cDNA SEQ ID NO: 2 
XX 
KW Human; 17723 receptor protein; chromosome 1q42-44; diagnosis; vaccine;
KW G-protein coupled receptor; gene therapy; ss.
XX

OS Homo sapiens.
XX

PN WO200043513-A1.
XX

PD 27-JUL-2000.
XX

PF 21-JAN-2000; 2000WO-US01592.
XX

PR 21-JAN-1999; 99US-0234923.
XX

PA (MILL-) MILLENNIUM PHARM INC.
XX

PI Glucksmann MA;
XX

DR WPI; 2000-476196/41.
DR P-PSDB; AAB12827.
XX

PT A G-protein-coupled receptor designated 17723 and the nucleic acids
PT that encode it, useful for preventing, diagnosing and treating disorder
PT associated with inappropriate expression of 17723 receptors -
XX

PS Claim 3; Page 72-73; 79pp; English.
XX

CC The present sequence encodes the human 17723 receptor protein (I), which
CC belongs to the superfamily of G-protein-coupled receptors. (I) and the
CC polynucleotide encoding it may be used in the prevention, treatment and
CC diagnosis of diseases associated with inappropriate 17723 receptor
CC expression. They may also be used to study the expression and function
CC of 17723 receptor polypeptides and their role in metabolism. The 17723
CC receptor polypeptides may be used as antigens in the production of
CC antibodies against 17723 receptors and in assays to identify modulators
CC (agonists and antagonists) of 17723 receptor expression and activity.
CC The anti-17723 receptor antibodies and 17723 receptor antagonists may be
CC used to down regulate 17723 receptor expression and activity. The
CC anti-17723 receptor antibodies may also be used as diagnostic agents for
CC detecting the presence of 17723 receptor polypeptides in samples
CC (e.g. by enzyme linked immunosorbent assay (ELISA)). The 17723 receptor
CC protein has been mapped to chromosome 1q42-44.
XX

SQ Sequence 2144 BP; 525 A; 531 C; 496 G; 590 T; 2 others;
Query Match 94.9%; Score 1459; DB 21; Length 2144; 
Best Local Similarity 99.9%; Pred. No. 0; 

Matches 1470; Conservative 0; Mismatches 0; Indels 1; Gaps 
Qy   67 cggcggagccccagccccacccagtgcggagcgcgccgcgagccccgccgcaagctgagc 126
        |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db    1 cggcggagccccag-cccacccagtgcggagcgcgccgcgagccccgccgcaagctgagc 59

Qy  127 gcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgcagcggcggc 186
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   60 gcctccgcccgccaggcgcgccggcgccgggccatgtactcggggaaccgcagcggcggc 119

Qy  187 cacggctactgggacggcggcggggccgcgggcgctgaggggccggcgccggcggggaca 246
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  120 cacggctactgggacggcggcggggccgcgggcgctgaggggccggcgccggcggggaca 179

Qy  247 ctgagccccgcgcccctcttcagccccggcacctacgagcgcctggcgctgctgctgggc 306
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  180 ctgagccccgcgcccctcttcagccccggcacctacgagcgcctggcgctgctgctgggc 239

Qy  307 tccattgggctgctgggcgtcggcaacaacctgctggtgctcgtcctctactacaagttc 366
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  240 tccattgggctgctgggcgtcggcaacaacctgctggtgctcgtcctctactacaagttc 299

Qy  367 cagcggctccgcactcccactcacctcctcctggtcaacatcagcctcagcgacctgctg 426
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  300 cagcggctccgcactcccactcacctcctcctggtcaacatcagcctcagcgacctgctg 359

Qy  427 gtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggctgggtgtgg 486
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  360 gtgtccctcttcggggtcacctttaccttcgtgtcctgcctgaggaacggctgggtgtgg 419

Qy  487 gacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggattgtttccatt 546
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  420 gacaccgtgggctgcgtgtgggacgggtttagcggcagcctcttcgggattgtttccatt 479

Qy  547 gccaccctaaccgtgctggcctatgaacgttacattcgcgtggtccatgccagagtgatc 606
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  480 gccaccctaaccgtgctggcctatgaacgttacattcgcgtggtccatgccagagtgatc 539

Qy  607 aatttttcctgggcctggagggccattacctacatctggctctactcactggcgtgggca 666
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  540 aatttttcctgggcctggagggccattacctacatctggctctactcactggcgtgggca 599

Qy  667 ggagcacctctcctgggatggaacaggtacatcctggacgtacacggactaggctgcact 726
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  600 ggagcacctctcctgggatggaacaggtacatcctggacgtacacggactaggctgcact 659

Qy  727 gtggactggaaatccaaggatgccaacgattcctcctttgtgcttttcttatttcttggc 786
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  660 gtggactggaaatccaaggatgccaacgattcctcctttgtgcttttcttatttcttggc 719

Qy  787 tgcctggtggtgcccctgggtgtcatagcccattgctatggccatattctatattccatt 846
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  720 tgcctggtggtgcccctgggtgtcatagcccattgctatggccatattctatattccatt 779

Qy  847 cgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagattttaaaatat 906
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  780 cgaatgcttcgttgtgtggaagatcttcagacaattcaagtgatcaagattttaaaatat 839

Qy  907 gaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtctgttggatg 966
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  840 gaaaagaaactggccaaaatgtgctttttaatgatattcaccttcctggtctgttggatg 899

Qy  967 ccttatatcgtgatctgcttcttggtggttaatggtcatggtcacctggtcactccaaca 1026
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  900 ccttatatcgtgatctgcttcttggtggttaatggtcatggtcacctggtcactccaaca 959




Qy 1027 atatctattgtttcgtacctctttgctaaatcgaacactgtatacaatccagtgatttat 1086
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  960 atatctattgtttcgtacctctttgctaaatcgaacactgtatacaatccagtgatttat 1019

Qy 1087 gtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctccgactgctg 1146
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1020 gtcttcatgatcagaaagtttcgaagatcccttttgcagcttctgtgcctccgactgctg 1079

Qy 1147 aggtgccagaggcctgctaaagacctaccagcagctggaagtgaaatgcagatcagaccc 1206
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1080 aggtgccagaggcctgctaaagacctaccagcagctggaagtgaaatgcagatcagaccc 1139

Qy 1207 attgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttcaactcttct 1266
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1140 attgtgatgtcacagaaagatggggacaggccaaagaaaaaagtgactttcaactcttct 1199

Qy 1267 tccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagcgacaaaacc 1326
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1200 tccatcatttttatcatcaccagtgatgaatcactgtcagttgacgacagcgacaaaacc 1259

Qy 1327 aatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaagaatggcaacg 1386
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1260 aatgggtccaaagttgatgtaatccaagttcgtcctttgtaggaatgaagaatggcaacg 1319

Qy 1387 aaagatggggccttaaattggatgccacttttggactttcatcataagaagtgt:tggaa 1446
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1320 aaagatggggccttaaattggatgccacttttggactttcatcataagaagtgt:tggaa 1379

Qy 1447 tacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccgaattgccca 1506
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1380 tacccgttctatgtaatatcaacagaaccttgtggtccagcaggaaatccgaattgccca 1439

Qy 1507 tatgctcttgggcctcaggaagaggttgaac 1537
        |||||||||||||||||||||||||||||||
Db 1440 tatgctcttgggcctcaggaagaggttgaac 1470





SUMMARIES

                  %
Result          Query
   No.   Score  Match  Length  DB  ID                  Description

C    1   422.4   27.5    5000   3  US-09-147-522-5     Sequence 5, Appli
     2   107.6    7.0    3016   1  US-07-805-123C-1    Sequence 1, Appli
     3   107.6    7.0    3016   1  US-08-033-081B-1    Sequence 1, Appli
     4      78    5.1    1105   2  US-08-466-103A-15   Sequence 15, Appl
     5      73    4.7    1410   4  US-09-255-368-1     Sequence 1, Appli
     6    69.2    4.5    1420   1  US-08-358-171-1     Sequence 1, Appli
     7    69.2    4.5    1420   3  US-09-090-947-1     Sequence 1, Appli
     8    67.2    4.4    1293   4  US-09-255-368-7     Sequence 7, Appli
     9      57    4.4    1776   1  US-08-722-001-29    Sequence 29, Appl
    10    65.4    4.3    2140   1  US-08-334-698-1     Sequence 1, Appli
    11    65.4    4.3    2140   1  US-08-228-932-1     Sequence 1, Appli
    12    65.4    4.3    2140   1  US-08-468-939-1     Sequence 1, Appli


Result          Query
   No.   Score  Match  Length  DB  ID        Description

     1   660.8   43.0     789  10  BI818538  BI818538 603033059
     2   644.8   42.0     770  10  BI260681  BI260681 602968193
     3   609.8   39.7     909  10  BE894106  BE894106 601438234
     4   580.8   37.8     736  10  BI086726  BI086726 602850078
     5   577.2   37.6     835  10  BF970560  BF970560 602274056
     6     575   37.4     850  10  BI757207  BI757207 603030709
     7   565.4   36.8     748  10  BG252201  BG252201 602365072
     8   515.8   33.6     741  10  BG564220  BG564220 602586010
     9   467.8   30.4     788  10  BF977798  BF977798 602148633
    10   461.8   30.0     784  10  BI758685  BI758685 603024224
    11   426.8   27.8     631   9  BB640431  BB640431 BB640431
    12     426   27.7     819  10  BI088684  BI088684 602851458
    13     423   27.5     424  10  BM194008  BM194008 TCAAP1E64
    14   406.8   26.5     742  10  BI257225  BI257225 602976885
    15   398.4   25.9     615  10  BF132059  BF132059 601821062



SEQ ID NO: 2 

SUMMARIES

                  %
Result          Query
   No.   Score  Match  Length  DB  ID        Description

     1    2117  100.0     402  21  AAB12827  Human 17723 recept
     2    2117  100.0     402  21  AAY98008  Human G-protein co
     3    2105   99.4     402  22  AAE12070  Dendritic cell (DC
     4    1063   50.2     199  22  AAB64743  Human secreted pro
     5     756   35.7     147  21  AAY32195  Human receptor mol
     6     664   31.4     163  22  AAE12071  Dendritic cell (DC
     7     664   31.4     879  22  AAU31008  Novel human secret
     8     572   27.0     123  20  AAY60172  Human endometrium
     9     564   26.6     122  21  AAB38327  Human secreted pro
    10   459.5   21.7     349  10  AAP90554  Bovine rhodopsin.
    11     455   21.5     348  17  AAR93116  Rhodopsin. Homo s
    12     451   21.3     348  21  AAY98009  Human rhodopsin re
    13     449   21.2     348  14  AAR38483  Rhodopsin protein.
    14     424   20.0     354  21  AAY57086  Rhodopsin amino ac
    15   420.5   19.9     309  15  AAR48735  G-protein coupled
    16   420.5   19.9     309  17  AAW02707  G-protein coupled



RESULT 1
AAB12827

ID AAB12827 standard; Protein; 402 AA.
XX

AC AAB12827;
XX

DT 05-DEC-2000 (first entry)
XX

DE Human 17723 receptor protein SEQ ID NO:1.
XX

KW Human; 17723 receptor protein; chromosome 1q42-44; diagnosis; vaccine;
KW G-protein coupled receptor; gene therapy.
XX

OS Homo sapiens.
XX

PN WO200043513-A1.
XX

PD 27-JUL-2000.
XX

PF 21-JAN-2000; 2000WO-US01592.
XX

PR 21-JAN-1999; 99US-0234923.
XX

PA (MILL-) MILLENNIUM PHARM INC.
XX

PI Glucksmann MA;
XX

DR WPI; 2000-476196/41.
DR N-PSDB; AAA73212.
XX

PT A G-protein-coupled receptor designated 17723 and the nucleic acids
PT that encode it, useful for preventing, diagnosing and treating disorder
PT associated with inappropriate expression of 17723 receptors -
XX

PS Claim 1; Page 70-72; 79pp; English.
XX

CC The present sequence is the human 17723 receptor protein (I), which
CC belongs to the superfamily of G-protein-coupled receptors. (I) and the
CC polynucleotide encoding it may be used in the prevention, treatment and
CC diagnosis of diseases associated with inappropriate 17723 receptor
CC expression. They may also be used to study the expression and function
CC of 17723 receptor polypeptides and their role in metabolism. The 17723
CC receptor polypeptides may be used as antigens in the production of
CC antibodies against 17723 receptors and in assays to identify modulators
CC (agonists and antagonists) of 17723 receptor expression and activity.
CC The anti-17723 receptor antibodies and 17723 receptor antagonists may be
CC used to down regulate 17723 receptor expression and activity. The
CC anti-17723 receptor antibodies may also be used as diagnostic agents for
CC detecting the presence of 17723 receptor polypeptides in samples
CC (e.g. by enzyme linked immunosorbent assay (ELISA)). The 17723 receptor
CC protein has been mapped to chromosome 1q42-44.
XX

SQ Sequence 402 AA;

Query Match 100.0%; Score 2117; DB 21; Length 402;
Best Local Similarity 100.0%; Pred. No. 6.3e-222;
Matches 402; Conservative 0; Mismatches 0; Indels 0; Gaps


Qy   1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 mysgnrsgghgywdgggaagaegpapagtlspaplfspgtyerlalllgsigllgvgnnl 60

Qy  61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 lvlvlyykfqrlrtpthlllvnislsdllvslfgvtftfvsclrngwvwdtvgcvwdgfs 120

Qy 121 GSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 gslfgivsiatltvlayeryirvvharvinfswawraityiwlyslawagapllgwnryi 180

Qy 181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQT 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 ldvhglgctvdwkskdandssfvlflflgclvvplgviahcyghilysirmlrcvedlqt 240

Qy 241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 iqvikilkyekklakmcflmiftflvcwmpyivicflvvnghghlvtptisivsylfaks 300

Qy 301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 ntvynpviyvfmirkfrrsllqllclrllrcqrpakdlpaagsemqirpivmsqkdgdrp 360

Qy 361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402
       ||||||||||||||||||||||||||||||||||||||||||
Db 361 kkkvtfnsssiifiitsdeslsvddsdktngskvdviqvrpl 402





RESULT 2 
AAY98008 

ID AAY98008 standard; Protein; 402 AA.
XX 

AC AAY98008; 
XX 

DT 31-AUG-2000 (first entry) 
XX 

DE Human G-protein coupled receptor, HG51. 
XX 

KW Human; G-protein coupled receptor; HG51; signal transduction; 
KW rhodopsin receptor; obesity; type II diabetes; 

KW inflammatory bowel disease; constipation; diarrhoea; gene therapy. 
XX 

OS Homo sapiens. 
XX 

PN WO200031108-A1. 
XX 

PD 02-JUN-2000. 
XX 



PF 18-NOV-1999; 99WO-US27305 . 
XX 

PR 24-NOV-1998; 98US- 0109717 . 
XX 

PA (MERI ) MERCK & CO INC. 
XX 

PI Liu Q, McDonald TP; 
XX 

DR WPI; 2000-400025/34. 

DR N-PSDB; AAA38861. 
XX 

PT New DNA encoding human HG51 (a G-protein coupled receptor) , useful in 

PT chromosomal mapping studies for identifying the chromosomal locations 

PT of the HG51 gene(s) - 
XX 

PS Claim 23; Fig 2; 68pp ; English. 
XX 

CC G protein-coupled receptors (GPCR) are important in signal transduction 

CC from the exterior to the interior of cells. Rhodopsin receptors are a 

CC type of GPCR which comprise a chromophore -binding pocket which is 

CC covalently linked by a protonated Schiff base to a Lys residue in 

CC transmembrane domain 7. The present sequence is the human HG51 GPCR and 

CC is a member of the rhodopsin receptor family of GPCRs. Due to the Lys 

CC residue and Schiff base present in HG51, it is thought that the HG51 

CC ligand may be a fatty-acid-like molecule. It is also believed that 

CC agonists and antagonists of HG51 are useful for treating various 

CC disorders such as obesity, type II diabetes, inflammatory bowel disease,

CC constipation or diarrhoea. In addition, the coding sequence for the 

CC present sequence may be used in gene therapy for the above mentioned 

CC disorders . 

XX 

SQ Sequence 402 AA;



Query Match 100.0%; Score 2117; DB 21; Length 402; 

Best Local Similarity 100.0%; Pred. No. 6.3e-222; 



Matches 402; Conservative 0; Mismatches 0; Indels 0; Gaps

Qy   1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 mysgnrsgghgywdgggaagaegpapagtlspaplfspgtyerlalllgsigllgvgnnl 60

Qy  61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 lvlvlyykfqrlrtpthlllvnislsdllvslfgvtftfvsclrngwvwdtvgcvwdgfs 120

Qy 121 GSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 gslfgivsiatltvlayeryirvvharvinfswawraityiwlyslawagapllgwnryi 180

Qy 181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQT 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 ldvhglgctvdwkskdandssfvlflflgclvvplgviahcyghilysirmlrcvedlqt 240

Qy 241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 iqvikilkyekklakmcflmiftflvcwmpyivicflvvnghghlvtptisivsylfaks 300

Qy 301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 ntvynpviyvfmirkfrrsllqllclrllrcqrpakdlpaagsemqirpivmsqkdgdrp 360

Qy 361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402
       ||||||||||||||||||||||||||||||||||||||||||
Db 361 kkkvtfnsssiifiitsdeslsvddsdktngskvdviqvrpl 402





RESULT 3 
AAE12070 

ID AAE12070 standard; Protein; 402 AA.
XX

AC AAE12070;
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Dendritic cell (DC) DCEPR protein. 
XX 

KW Dendritic cell; DC; DCEPR protein; gene therapy; dermatological ; vaccine 
KW atopic dermatitis; autoimmune disease; inflammatory skin disease; cancer 
KW immunosuppressive; AIDS; Acquired immune deficiency syndrome; cytostatic 
KW chromosomal identification; pharmaceutical; hypersensitivity; virucide; 
KW transplant rejection; chronic inflammatory disease; anti-HIv' 
XX 

OS Unidentified. 
XX 

PN WO200172773-A2. 
XX 

PD 04-OCT-2001. 
XX 

PF 28-MAR-2001; 2001WO-EP03542 . 
XX 

PR 29-MAR-2000; 2000US-192934P . 
PR 18-MAY-2000; 2000US-205020P . 
PR 18-MAY-2000; 2000US-205026P . 
PR 19-MAY-2000; 2000US-205767P . 
PR 19-MAY-2000; 2000US-205769P . 
XX 

PA (NOVS ) NOVARTIS AG. 

PA (NOVS ) NOVARTIS -ERFINDUNGEN VERW GES MBH. 
XX 

PI Werner G, Phares W, Jaritz M, Lapp H, Kalthoff FS • 
XX 

DR WPI; 2001-616466/71. 

DR N-PSDB; AAD19720 . 
XX 

PT New polypeptides for screening therapeutic agonists and antagonists 

PT comprise dendritic cell polypeptides - 

XX 

PS Claim 2; Page 42; 52pp; English. 
XX 

CC The invention relates to dendritic cell (DC) proteins and their 

CC corresponding DNA molecules. A pharmaceutical composition comprising 

CC agonist and antagonist of DC proteins are useful for treating abnormal 

CC conditions related to both an excess of and insufficient level of 

CC expression of DC gene, or related to both an excess of and insufficient 

CC activity of DC protein. Soluble form of DC proteins are used as an active 

CC ingredient in combination with pharmaceutical acceptable carriers. 

CC DC genes and proteins are useful for treating chronic inflammatory 

CC diseases, autoimmune diseases, transplant rejection crisis, including 

CC inflammatory skin diseases such as contact hypersensitivity, atopic 

CC dermatitis or virally- induced immune suppression such as AIDS and cancer. 

CC DC protein is useful for inducing immunological response in a mammal, and 

CC as immunogen to produce antibodies immunospecif ic for the polypeptide. 

CC DC gene is useful in gene therapy. DC gene is also useful as a diagnostic 

CC reagent, and for chromosomal identification. The present sequence is 

CC dendritic cell (DC) DCEPR protein which is found to belong to the family 

CC of G-protein coupled receptor protein. 

XX 

SQ Sequence 402 AA;



Query Match 99.4%; Score 2105; DB 22; Length 402; 

Best Local Similarity 99.5%; Pred. No. 1.3e-220; 

Matches 400; Conservative 1; Mismatches 1; Indels 0; Gaps 
Qy   1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60
       |||||||||||||||||||||:||||||||||||||||||||||||||||||||||||||
Db   1 mysgnrsgghgywdgggaagakgpapagtlspaplfspgtyerlalllgsigllgvgnnl 60

Qy  61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 lvlvlyykfqrlrtpthlllvnislsdllvslfgvtftfvsclrngwvwdtvgcvwdgfs 120

Qy 121 GSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 gslfgivsiatltvlayeryirvvharvinfswawraityiwlyslawagapllgwnryi 180

Qy 181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQT 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 ldvhglgctvdwkskdandssfvlflflgclvvplgviahcyghilysirmlrcvedlqt 240

Qy 241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 iqvikilkyekklakmcflmiftflvcwmpyivicflvvnghghlvtptisivsylfaks 300

Qy 301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 ntvynpviyvfmirkfrrsllqllclrllrcqrpakdlpaagsemqirpivmsqkdgdrp 360

Qy 361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402
       |||||||||||||| |||||||||||||||||||||||||||
Db 361 kkkvtfnsssiifigtsdeslsvddsdktngskvdviqvrpl 402





RESULT 4 
AAB64743 

ID AAB64743 standard; Protein; 199 AA. 
XX 

AC AAB64743; 
XX 

DT 23-MAR-2001 (first entry) 
XX 
DE Human secreted protein sequence encoded by gene 15 SEQ ID NO: 137.
XX

KW Human; secreted protein; diagnosis; cytostatic; antirheumatic;
KW antiarthritic; dermalogical; cardiant; antiinflammatory; anti-ulcer;
KW gastrointestinal; solid tumour; rheumatoid arthritis; psoriasis;
KW diabetic retinopathy; myocardial angiogenesis; Crohn's disease;
KW ulcer
XX

OS Homo sapiens.
XX

PN WO200077237-A1.
XX

PD 21-DEC-2000.
XX

PF 01-JUN-2000; 2000WO-US14928.
XX

PR 11-JUN-1999; 99US-0138633.
XX

PA (HUMA-) HUMAN GENOME SCI INC.
PA (ROSE/) ROSEN C A.
XX

PI Rosen CA, Ruben SM, Komatsoulis GA;
XX

DR WPI; 2001-071280/08.
XX

PT Nucleic acids encoding 49 human secreted polypeptides, useful for
PT preventing, diagnosing and/or treating diseases such as tumors,
PT rheumatoid arthritis, psoriasis and diabetic retinopathy -
XX

PS Disclosure; Page 503; 520pp; English.
XX

CC The polynucleotide sequences given in AAF33037 to AAF33085 encode the
CC human secreted proteins given in AAB64666 to AAB64714. AAB64715 to
CC AAB64771 represent human secreted polypeptide sequences and proteins
CC homologous to them, which are given in the exemplification of the present
CC invention. Human secreted proteins have activities based on the tissues
CC and cells the genes are expressed in. Examples of activities include:
CC cytostatic; antirheumatic; antiarthritic; dermalogical; cardiant;
CC antiinflammatory; gastrointestinal; and anti-ulcer. The polynucleotides
CC and polypeptides can be used in the prevention, treatment and diagnosis
CC of diseases associated with inappropriate polypeptide expression.
CC Disorders that may be treated or prevented include solid tumours,
CC rheumatoid arthritis, psoriasis, diabetic retinopathy, myocardial
CC angiogenesis, Crohn's disease and ulcers. The polynucleotides and their
CC complementary sequences may also be used as DNA probes in diagnostic
CC assays (e.g. polymerase chain reactions (PCR)) to detect and quantitate
CC the presence of similar nucleic acid sequences in samples, and therefore
CC which patients may be in need of restorative therapy. The polypeptides
CC may also be used as antigens in the production of antibodies against the
CC polypeptide and in assays to identify modulators (agonists and
CC antagonists) of polypeptide expression and activity. The anti-polypeptide
CC antibodies and antagonists may also be used to down regulate expression
CC and activity. AAF33028 to AAF33036 and AAB64665 represent sequences used
CC in the exemplification of the present invention.
XX

SQ Sequence 199 AA;



Query Match 
Best Local Similarity 



iniiii 



Matches 


Qy 


118 


Db 


1 


Qy 


178 


Db 


61 


Qy 


238 


Db 


121 


Qy 


298 


Db 


181 



50.2%; Score 1063; DB 22; 
100.0%; Pred. No. 2e-107; 
tive 0; Mismatches 0; 



Length 199; 
Indels 0; Gaps 



MIIMIIIIIIIIMIIIIII 



IIMIIIIIIMIIIMI 



llllilNIIIMIIIIIIIIIIIIIIIIIIIIIIIIIMllllllliiMiiiiiiiii 



1 1 1 1 



I ! 1 1 1 1 M: !; 1 1 1 1 ; I U ,' 



Mil 



III 



SUMMARIES

                  %
Result          Query
   No.   Score  Match  Length  DB  ID                  Description

     1     451   21.3     348   2  US-08-390-000A-8    Sequence 8, Appli
     2     444   21.0     348   4  US-08-430-286A-11   Sequence 11, Appl
     3   420.5   19.9     309   1  US-08-118-270-56    Sequence 56, Appl
     4   420.5   19.9     309   5  PCT-US93-08528-56   Sequence 56, Appl
     5   354.5   16.7     297   1  US-08-118-270-58    Sequence 58, Appl
     6   354.5   16.7     297   5  PCT-US93-08528-58   Sequence 58, Appl
     7   341.5   16.1     305   1  US-08-118-270-59    Sequence 59, Appl
     8   341.5   16.1     305   5  PCT-US93-08528-59   Sequence 59, Appl
     9   338.5   16.0     297   1  US-08-118-270-57    Sequence 57, Appl
    10   338.5   16.0     297   5  PCT-US93-08528-57   Sequence 57, Appl
    11     309   14.6     391   1  US-07-816-283-2     Sequence 2, Appli
    12     309   14.6     391   1  US-08-417-103-2     Sequence 2, Appli
    13     309   14.6     391   1  US-08-417-103-14    Sequence 14, Appl
    14     304   14.4     391   1  US-07-816-283-4     Sequence 4, Appli
    15     304   14.4     391   1  US-08-417-103-4     Sequence 4, Appli


Result          Query
   No.   Score  Match  Length  DB  ID      Description

     1   477.5   22.6     349   1  JC5490  opsin, pineal glan
     2     475   22.4     351   1  A55962  opsin, pineal glan
     3     464   21.9     352   2  150081  rhodopsin - green
     4     458   21.6     348   1  OOBO    rhodopsin - bovine
     5   456.5   21.6     348   1  JC4267  opsin - rabbit
     6     455   21.5     348   1  S23398  rhodopsin - Chines
     7   452.5   21.4     351   2  S29152  rhodopsin - chicke
     8     451   21.3     348   1  OOHU    rhodopsin - human
     9     451   21.3     354   1  S27231  rhodopsin - northe
    10     450   21.3     348   1  A23665  opsin - mouse
    11     448   21.2     354   1  151200  rhodopsin - Africa



Result          Query
   No.   Score  Match  Length  DB  ID          Description

     1    2117  100.0     402   1  OPN3_HUMAN  Q9h1y3 homo sapien
     2    1862   88.0     400   1  OPN3_MOUSE  Q9wuk7 mus musculu
     3   477.5   22.6     349   1  OPSP_COLLI  P51476 columba liv
     4     475   22.4     351   1  OPSP_CHICK  P51475 gallus gall
     5     468   22.1     352   1  OPSD_ALLMI  P52202 alligator m
     6   466.5   22.0     444   1  OPSP_PETMA  O42490 petromyzon
     7     464   21.9     352   1  OPSD_ANOCA  P41591 anolis caro
     8     458   21.6     348   1  OPSD_BOVIN  P02699 bos taurus
     9   456.5   21.6     348   1  OPSD_RABIT  P49912 oryctolagus
    10     455   21.5     348   1  OPSD_CRIGR  P28681 cricetulus
    11     455   21.5     348   1  OPSD_MACFA  Q28886 macaca fasc
    12     455   21.5     354   1  OPSD_RANCA  P51470 rana catesb



RESULT 1 
OPN3_HUMAN

ID OPN3_HUMAN STANDARD; PRT; 402 AA.
AC Q9H1Y3; Q9Y344;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 01-MAR-2002 (Rel. 41, Last annotation update)
DE Opsin 3 (Encephalopsin) (Panopsin).
GN OPN3 OR ECPN.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=99252448; PubMed=10234000;
RA Blackshaw S., Snyder S.H.;
RT "Encephalopsin: a novel mammalian extraretinal opsin discretely
RT localized in the brain.";
RL J. Neurosci. 19:3681-3690(1999).
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=21295039; PubMed=11401433;
RA Halford S., Freedman M.S., Bellingham J., Inglis S.L.,
RA Poopalasundaram S., Soni B.G., Foster R.G., Hunt D.M.;
RT "Characterization of a novel human opsin gene with wide tissue
RT expression and identification of embedded and flanking genes on
RT chromosome 1q43.";
RL Genomics 72:203-208(2001).
RN [3]
RP SEQUENCE FROM N.A.
RA Parker A.;
RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: May play a role in encephalic photoreception.
CC -!- SUBCELLULAR LOCATION: Integral membrane protein.
CC -!- TISSUE SPECIFICITY: Strongly expressed in brain. Highly expressed
CC in the preoptic area and paraventricular nucleus of the
CC hypothalamus. Shows highly patterned expression in other regions
CC of the brain, being enriched in selected regions of the cerebral
CC cortex, cerebellar Purkinje cells, a subset of striatal neurons,
CC selected thalamic nuclei, and a subset of interneurons in the
CC ventral horn of the spinal cord.
CC -!- SIMILARITY: BELONGS TO FAMILY 1 OF G-PROTEIN COUPLED RECEPTORS.
CC OPSIN SUBFAMILY.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).




DR EMBL; AF140242; AAD32671.1;
DR EMBL; AF303588; AAK37447.1;
DR EMBL; AL133390; CAC19785.1;
DR InterPro; IPR000276; GPCR_Rhodpsn.
DR Pfam; PF00001; 7tm_1; 1.
DR PRINTS; PR00237; GPCRRHODOPSN.
DR PROSITE; PS00237; G_PROTEIN_RECEP_F1_1; 1.
DR PROSITE; PS50262; G_PROTEIN_RECEP_F1_2; 1.
DR PROSITE; PS00238; OPSIN; 1.
KW Photoreceptor; Retinal protein; Transmembrane; Lipoprotein; Palmitate;
KW G-protein coupled receptor.






FT   DOMAIN        1     40       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM     41     65       1 (POTENTIAL).
FT   DOMAIN       66     77       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM     78    102       2 (POTENTIAL).
FT   DOMAIN      103    117       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    118    137       3 (POTENTIAL).
FT   DOMAIN      138    153       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM    154    177       4 (POTENTIAL).
FT   DOMAIN      178    201       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    202    229       5 (POTENTIAL).
FT   DOMAIN      230    255       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM    256    279       6 (POTENTIAL).
FT   DOMAIN      280    287       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    288    312       7 (POTENTIAL).
FT   DOMAIN      313    402       CYTOPLASMIC (POTENTIAL).
FT   DISULFID    114    188       BY SIMILARITY.
FT   BINDING     299    299       RETINAL CHROMOPHORE.
FT   LIPID       325    325       PALMITATE (BY SIMILARITY).
FT   CARBOHYD      5      5       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    198    198       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CONFLICT    390    396       NGSKVDV -> IGVQSLML (IN REF. 1).




SQ   SEQUENCE   402 AA;  44873 MW;  370F64C19F834A71 CRC64;
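The topology annotated in the FT lines of this entry can be sanity-checked with a few lines of Python. The sketch below is illustrative only: the segment coordinates are copied from the DOMAIN and TRANSMEM lines, while the names `SEGMENTS`, `check_tiling`, and `count_helices` are hypothetical helpers, not part of any database tool. It checks that the annotated segments cover residues 1 to 402 without gaps or overlaps and counts the transmembrane helices.

```python
# Topology segments copied from the DOMAIN/TRANSMEM FT lines: (key, start, end, note).
SEGMENTS = [
    ("DOMAIN", 1, 40, "EXTRACELLULAR"),
    ("TRANSMEM", 41, 65, "1"),
    ("DOMAIN", 66, 77, "CYTOPLASMIC"),
    ("TRANSMEM", 78, 102, "2"),
    ("DOMAIN", 103, 117, "EXTRACELLULAR"),
    ("TRANSMEM", 118, 137, "3"),
    ("DOMAIN", 138, 153, "CYTOPLASMIC"),
    ("TRANSMEM", 154, 177, "4"),
    ("DOMAIN", 178, 201, "EXTRACELLULAR"),
    ("TRANSMEM", 202, 229, "5"),
    ("DOMAIN", 230, 255, "CYTOPLASMIC"),
    ("TRANSMEM", 256, 279, "6"),
    ("DOMAIN", 280, 287, "EXTRACELLULAR"),
    ("TRANSMEM", 288, 312, "7"),
    ("DOMAIN", 313, 402, "CYTOPLASMIC"),
]


def check_tiling(segments, length):
    """Return True if the segments cover 1..length with no gaps or overlaps."""
    expected_start = 1
    for _key, start, end, _note in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


def count_helices(segments):
    """Count the segments annotated as TRANSMEM."""
    return sum(1 for key, *_rest in segments if key == "TRANSMEM")


if __name__ == "__main__":
    print(check_tiling(SEGMENTS, 402), count_helices(SEGMENTS))
```

Seven transmembrane segments alternating between extracellular and cytoplasmic domains is the arrangement expected for a member of the G-protein coupled receptor family named in the SIMILARITY comment.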





Query Match 100.0%; Score 2117; DB 1; Length 402; 

Best Local Similarity 100.0%; Pred. No. 1e-136;

Matches 402; Conservative 0; Mismatches 0; Indels 0; Gaps 



Qy          1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MYSGNRSGGHGYWDGGGAAGAEGPAPAGTLSPAPLFSPGTYERLALLLGSIGLLGVGNNL 60

Qy         61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 LVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFS 120

Qy        121 GSLFGIVSIATLTVLAYERYIRWHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GSLFGIVSIATLTVLAYERYIRWHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI 180

Qy        181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLWPLGVIAHCYGHILYSIRMLRCVEDLQT 240
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLWPLGVIAHCYGHILYSIRMLRCVEDLQT 240

Qy        241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLWNGHGHLVTPTISIVSYLFAKS 300
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 IQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLWNGHGHLVTPTISIVSYLFAKS 300

Qy        301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NTVYNPVIYVFMIRKFRRSLLQLLCLRLLRCQRPAKDLPAAGSEMQIRPIVMSQKDGDRP 360

Qy        361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402
              ||||||||||||||||||||||||||||||||||||||||||
Db        361 KKKVTFNSSSIIFIITSDESLSVDDSDKTNGSKVDVIQVRPL 402



Result          Query
   No.   Score  Match  Length  DB  ID       Description

     1   500.5   23.6     352  13  Q9W6K3   Q9w6k3 anolis caro
     2   491.5   23.2     534  13  O57422   O57422 xenopus lae
     3   484.5   22.9     346  13  Q9PUA9   Q9pua9 bufo japoni
     4   480     22.7     357  13  Q9IBH2   Q9ibh2 phelsuma ma
     5   473.5   22.4     377  13  Q9IB88   Q9ib88 brachydanio
     6   473     22.3     543  13  Q90YK6   Q90yk6 gallus gall
     7   458     21.6     348   6  Q95KU1   Q95ku1 felis silve
     8   457.5   21.6     351  13  Q9IA36   Q9ia36 poephila gu
     9   455.5   21.5     351  13  Q9W6S0   Q9w6s0 columba liv
    10   455     21.5     363  13  Q98TH3   Q98th3 cynops pyrr
    11   453.5   21.4     322  13  O57448   O57448 anas platyr
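The Query Match column in the result listing appears to be each hit's score expressed as a percentage of the query's self-match score (2117 for the 100.0% hit reported earlier). That relation is an inference from the reported numbers, not something the search output states, and the function name below is hypothetical. A minimal Python sketch under that assumption:

```python
# Assumption: Query Match = 100 * hit score / score of the query against itself.
SELF_SCORE = 2117  # score reported for the 100.0% query match


def query_match_percent(score, self_score=SELF_SCORE):
    """Return the hit score as a percentage of the self-match score, to one decimal."""
    return round(100.0 * score / self_score, 1)


if __name__ == "__main__":
    # Scores of the first three hits in the listing.
    for score in (500.5, 491.5, 484.5):
        print(score, query_match_percent(score))
```

Under this assumption the computed values agree with the Query Match figures listed for all eleven hits, which also helps restore decimal points lost from that column.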



